Description

FIELD OF THE INVENTION 

[0001] The present invention relates to the production of recombinant proteins using yeast host cells as the expression
system. More particularly, it relates to compositions and methods for expression of heterologous proteins and
their secretion as the biologically active mature proteins.

BACKGROUND OF THE INVENTION 

[0002] Yeast host expression systems have been used to express and secrete proteins foreign to yeast. Numerous
approaches have been developed to improve the degree of expression and the yield of biologically active mature
proteins.

[0003] Such approaches have involved modifications to the various molecular components that are involved in
expression and secretion of proteins in yeast. These components include the translation and termination regulatory
regions for gene expression; signal peptide and secretion leader peptide sequences, which direct the precursor form of
the heterologous protein through the yeast secretory pathway; and processing sites, at which leader peptide
sequences are cleaved from the polypeptide sequence of the protein of interest.

[0004] Expression of the protein of interest can be enhanced with use of yeast-recognized regulatory regions.
Increased yield of the heterologous protein of interest is commonly achieved with the use of yeast-derived signal and
secretion leader peptide sequences. The use of native signal-leader peptide sequences is believed to improve direction
of the protein of interest through the secretory pathway of the yeast host.

[0005] Previous work has demonstrated that full-length yeast α-factor signal-leader sequences can be used to drive
expression and processing of heterologous proteins in yeast host cells. Substantial improvements in efficiency of
expression can be accomplished with the use of truncated α-factor leader sequences, particularly for heterologous
proteins that are poorly expressed by the full-length sequence or whose expression is nonresponsive to the full-length
sequence.

[0006] Although the various approaches available in the art have been shown to work with some proteins, problems
persist with post-translational processing. Often the amount of protein secreted is unacceptably low or incorrect
processing leads to inactive forms of the protein. This is particularly true for proteins that are initially expressed as a
precursor polypeptide sequence and whose assumption of a native conformation is facilitated by the presence of a
native propeptide sequence in the precursor polypeptide.

[0007] Methods for expression of heterologous proteins and their secretion in a biologically active mature form using
a yeast host cell as the expression system are needed.

SUMMARY OF THE INVENTION 

[0008] The present invention provides nucleotide sequences encoding a signal sequence for a yeast secreted
protein, a native N-terminal or C-terminal propeptide sequence of a mature protein of interest, and a peptide sequence
for the mature protein of interest. Each of these elements is associated with a processing site recognized in vivo by a
yeast proteolytic enzyme. Any or all of these processing sites may be a preferred processing site that has been modified
or synthetically derived for more efficient cleavage in vivo. In turn, all of these elements are operably linked to a yeast
promoter and optionally other regulatory sequences.

[0009] The nucleotide coding sequences of these compositions may additionally comprise a leader peptide sequence
for a yeast secreted protein. When present, this element, which is also associated with a processing site recognized
in vivo by a yeast proteolytic enzyme, is positioned 3' to the yeast signal sequence and 5' to the sequence for the
mature protein of interest. Thus cleavage by a yeast proteolytic enzyme removes the yeast leader sequence from the
hybrid precursor molecule comprising the sequence for the mature protein of interest.

[0010] These compositions are useful in methods for expression of heterologous mammalian proteins and their
secretion in the biologically active mature form. Therefore the invention also provides vectors comprising the nucleotide
sequences of the invention and yeast host cells stably transformed with a nucleotide sequence of the invention. Such
cells can then be cultured and screened for secretion of the biologically active mature protein of interest.

[0011] The invention also provides a method for expression of heterologous proteins and their secretion in the
biologically active mature form using a yeast host cell as the expression system, said method comprising transforming
said yeast cell with a vector comprising a nucleotide sequence of the invention.

[0012] The method of the present invention is particularly useful in production of mammalian proteins whose
assumption of a native conformation is facilitated by the presence of a native propeptide sequence in the precursor
polypeptide.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] 

Figure 1 is a map of plasmid pAB24.

Figure 2 is a map of the rhPDGF-B expression cassette in pAGL7PB and pYAGL7PB. 

Figure 3 is a map of rhPDGF-B expression plasmid pYAGL7PB. 

Figure 4 is a map of the rhPDGF-B expression cassette in pL7PPB and pYL7PPB. 

Figure 5 shows the final steps in the construction of the rhPDGF-B expression cassette in pL7PPB.

Figure 6 is a map of rhPDGF-B expression plasmid pYL7PPB.

DETAILED DESCRIPTION OF THE INVENTION 

[0014] The present invention provides compositions and methods for expression of heterologous proteins of interest,
more particularly heterologous mammalian proteins, and their secretion in a biologically active mature form using a
yeast host cell as the expression system. By "biologically active mature form" is intended a protein whose
conformational form is similar to the native conformation such that its biological activity is substantially the same as the
biological activity of the native protein.

[0015] Compositions of the present invention are nucleotide sequences encoding hybrid precursor polypeptides that
each comprise the polypeptide sequence for a mature heterologous protein of interest. Expression vectors comprising
these nucleotide sequences, all under the operational control of a yeast promoter region and a yeast terminator region,
are also provided. Methods of the invention comprise stably transforming a yeast host cell with said vectors, where
expression of the nucleotide sequence encoding the hybrid precursor polypeptide leads to secretion of the mature
heterologous protein of interest in a biologically active form.

[0016] By "heterologous protein of interest" is intended a protein that is not expressed by the yeast host cell in nature.
Preferably the heterologous protein will be a mammalian protein, including substantially homologous and functionally
equivalent variants thereof. By "variant" is intended a polypeptide derived from the native polypeptide by deletion
(so-called truncation) or addition of one or more amino acids to the N-terminal and/or C-terminal end of the native protein;
deletion or addition of one or more amino acids at one or more sites in the native polypeptide; or substitution of one
or more amino acids at one or more sites in the native polypeptide. Such variants may result from, for example, genetic
polymorphism or from human manipulation. Methods for such manipulations are generally known in the art.
[0017] For example, amino acid sequence variants of the polypeptide can be prepared by mutations in the cloned
DNA sequence encoding the native polypeptide of interest. Methods for mutagenesis and nucleotide sequence
alterations are well known in the art. See, for example, Walker and Gaastra, eds. (1983) Techniques in Molecular Biology
(MacMillan Publishing Company, New York); Kunkel (1985) Proc. Natl. Acad. Sci. USA 82:488-492; Kunkel et al. (1987)
Methods Enzymol. 154:367-382; Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor,
New York); U.S. Patent No. 4,873,192; and the references cited therein. Guidance as to appropriate amino acid
substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff et al. (1978)
in Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Washington, D.C.), herein incorporated by
reference. Conservative substitutions, such as exchanging one amino acid with another having similar properties, may
be preferred. Examples of conservative substitutions include, but are not limited to, Gly⇔Ala, Val⇔Ile⇔Leu, Asp⇔Glu,
Lys⇔Arg, Asn⇔Gln, and Phe⇔Trp⇔Tyr.
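
The example substitution groups listed above can be expressed as a simple lookup. The following is a minimal sketch, assuming one-letter amino acid codes; the names CONSERVATIVE_GROUPS and is_conservative are hypothetical and are not taken from the specification, and the list is not exhaustive.

```python
# Conservative-substitution groups transcribed from the examples in the text.
# One-letter amino acid codes are an assumption of this sketch.
CONSERVATIVE_GROUPS = [
    {"G", "A"},       # Gly <=> Ala
    {"V", "I", "L"},  # Val <=> Ile <=> Leu
    {"D", "E"},       # Asp <=> Glu
    {"K", "R"},       # Lys <=> Arg
    {"N", "Q"},       # Asn <=> Gln
    {"F", "W", "Y"},  # Phe <=> Trp <=> Tyr
]


def is_conservative(a: str, b: str) -> bool:
    """Return True if a and b are different residues from the same group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

For instance, is_conservative("K", "R") holds, whereas is_conservative("K", "D") does not.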

[0018] In constructing variants of the protein of interest, modifications will be made such that variants continue to
possess the desired activity. Obviously, any mutations made in the DNA encoding the variant protein must not place
the sequence out of reading frame and preferably will not create complementary regions that could produce secondary
mRNA structure. See EP Patent Application Publication No. 75,444.

[0019] Thus proteins of the invention include the naturally occurring forms as well as variants thereof. These variants
will be substantially homologous and functionally equivalent to the native protein. A variant of a native protein is
"substantially homologous" to the native protein when at least about 80%, more preferably at least about 90%, and most
preferably at least about 95% of its amino acid sequence is identical to the amino acid sequence of the native protein.
A variant may differ by as few as 1, 2, 3, or 4 amino acids. By "functionally equivalent" is intended that the sequence
of the variant defines a chain that produces a protein having substantially the same biological activity as the native
protein of interest. Such functionally equivalent variants that comprise substantial sequence variations are also
encompassed by the invention. Thus a functionally equivalent variant of the native protein will have a sufficient biological
activity to be therapeutically useful. By "therapeutically useful" is intended effective in achieving a therapeutic goal, as,
for example, healing a wound.
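
The identity thresholds above can be illustrated with a short calculation. This is a sketch only: it assumes the native and variant sequences have already been aligned to equal length with '-' marking gaps (the text does not prescribe an alignment method), and it takes the variant's residue count as the denominator, which is one reading of "of its amino acid sequence". The function names are hypothetical.

```python
def percent_identity(native: str, variant: str) -> float:
    """Percent of variant residues identical to the aligned native residue.

    Both strings must be pre-aligned to the same length; '-' marks a gap.
    """
    if len(native) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    variant_len = sum(1 for r in variant if r != "-")
    if variant_len == 0:
        raise ValueError("variant sequence is empty")
    matches = sum(1 for n, v in zip(native, variant) if n == v and v != "-")
    return 100.0 * matches / variant_len


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the thresholds recited in the text."""
    if identity >= 95:
        return "at least about 95% (most preferred)"
    if identity >= 90:
        return "at least about 90% (more preferred)"
    if identity >= 80:
        return "at least about 80%"
    return "below about 80%"
```

A ten-residue variant differing from the native sequence at one position scores 90% under this definition.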

[0020] Methods are available in the art for determining functional equivalence. Biological activity can be measured
using assays specifically designed for measuring activity of the native protein, including assays described in the present
invention. Additionally, antibodies raised against the biologically active native protein can be tested for their ability to
bind to the functionally equivalent variant, where effective binding is indicative of a protein having a conformation similar
to that of the native protein.

[0021] The nucleotide sequences encoding the mature heterologous proteins of interest can be sequences cloned
from non-yeast organisms, or they may be synthetically derived sequences, usually prepared using yeast-preferred
codons. Examples of heterologous proteins suitable for the invention include, but are not limited to, transforming growth
factor (TGF-alpha and TGF-beta), somatostatin (as in SRIF 1), parathyroid hormone, and more particularly
platelet-derived growth factor (PDGF) and insulin-like growth factor (IGF), all of which have a native prosequence as part of the
precursor protein.

[0022] Thus compositions of the present invention are nucleotide sequences comprising in the 5' to 3' direction and
operably linked (a) a yeast-recognized transcription and translation initiation region, (b) a coding sequence for a hybrid
precursor polypeptide, and (c) a yeast-recognized transcription and translation termination region, wherein said hybrid
precursor polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_MHP-PS)_{n-3}-MHP-(PS-CPRO_MHP)_{n-4}-3'

wherein:

SP comprises a signal peptide sequence for a yeast secreted protein;

PS comprises a processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a yeast secreted protein;

NPRO_MHP comprises a native N-terminal propeptide sequence of a mature heterologous protein of interest;

MHP comprises a peptide sequence for said mature heterologous mammalian protein of interest;

CPRO_MHP comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow for proteolytic processing of said precursor polypeptide to said mature protein in
vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.
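
The element order and the constraints on n-1 through n-4 recited above can be sketched as a small routine that lists the elements of a hybrid precursor in 5' to 3' order. The function name precursor_layout and the element labels are illustrative assumptions, not part of the specification.

```python
def precursor_layout(n1: int, n2: int, n3: int, n4: int) -> list[str]:
    """List the hybrid precursor elements, 5' to 3', for the given flags.

    n1..n4 correspond to n-1..n-4 in the text: each is 0 or 1, and at
    least one of n3 (N-terminal propeptide) or n4 (C-terminal propeptide)
    must be 1.
    """
    for name, n in (("n-1", n1), ("n-2", n2), ("n-3", n3), ("n-4", n4)):
        if n not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1")
    if not (n3 or n4):
        raise ValueError("at least one of n-3 or n-4 must equal 1")

    layout = ["SP"]                      # yeast signal peptide
    if n1:
        layout.append("PS")              # processing site after the signal
    if n2:
        layout += ["LP", "PS"]           # yeast leader peptide and its site
    if n3:
        layout += ["NPRO_MHP", "PS"]     # native N-terminal propeptide
    layout.append("MHP")                 # mature heterologous protein
    if n4:
        layout += ["PS", "CPRO_MHP"]     # native C-terminal propeptide
    return layout
```

For example, precursor_layout(1, 1, 1, 0) yields SP, PS, LP, PS, NPRO_MHP, PS, MHP, while precursor_layout(1, 1, 0, 0) is rejected because neither propeptide is present.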

[0023] As is the case for the heterologous protein of interest, each of the other elements present in the hybrid
precursor polypeptide can be a known naturally occurring polypeptide sequence or can be synthetically derived, including
any variants thereof that do not adversely affect the function of the element as described herein. By "adversely affect"
is intended that inclusion of the variant form of the element results in decreased yield of the secreted mature heterologous
protein of interest relative to the hybrid precursor polypeptide comprising the native form of the element.

[0024] In constructing the nucleotide sequence encoding the hybrid precursor polypeptide, it is within skill in the art
to employ adapters or linkers to join the nucleotide fragments encoding the various elements of the precursor
polypeptide. See, for example, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, New
York). Thus, the hybrid precursor polypeptide may comprise additional elements positioned 5' or 3' to any of the primary
elements listed above, including the yeast leader peptide sequence and its associated yeast-recognized processing
site when present.

[0025] For purposes of the present invention, SP is a presequence that is an N-terminal sequence for the precursor
polypeptide of the mature form of a yeast secreted protein. When the nucleotide sequence encoding the hybrid
precursor polypeptide is expressed in a transformed yeast host cell, the signal peptide sequence functions to direct the
hybrid precursor polypeptide comprising the mature heterologous protein of interest into the endoplasmic reticulum
(ER). Movement into the lumen of the ER represents the initial step into the secretory pathway of the yeast host cell.
Although the signal peptide of the invention can be heterologous to the yeast host cell, more preferably the signal
peptide will be native to the host cell.

[0026] The signal peptide sequence of the invention may be a known naturally occurring signal sequence or any
variant thereof as described above that does not adversely affect the function of the signal peptide. Examples of signal
peptides appropriate for the present invention include, but are not limited to, the signal peptide sequences for α-factor
(see, for example, U.S. Patent No. 5,602,034; Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646); invertase
(WO 84/01153); PHO5 (DK 3614/83); YAP3 (yeast aspartic protease 3; PCT Publication No. 95/02059); and BAR1
(PCT Publication No. 87/02670). Alternatively, the signal peptide sequence may be determined from genomic or cDNA
libraries using hybridization probe techniques available in the art (see Sambrook et al. (1989) Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor, New York)), or even synthetically derived (see, for example, WO 92/11378).
[0027] During entry into the ER, the signal peptide is cleaved off the precursor polypeptide at a processing site. The
processing site can comprise any peptide sequence that is recognized in vivo by a yeast proteolytic enzyme. This
processing site may be the naturally occurring processing site for the signal peptide. More preferably, the naturally
occurring processing site will be modified, or the processing site will be synthetically derived, so as to be a preferred
processing site. By "preferred processing site" is intended a processing site that is cleaved in vivo by a yeast proteolytic
enzyme more efficiently than is the naturally occurring site. Examples of preferred processing sites include, but are
not limited to, dibasic peptides, particularly any combination of the two basic residues Lys and Arg, that is Lys-Lys,
Lys-Arg, Arg-Lys, or Arg-Arg, most preferably Lys-Arg. These sites are cleaved by the endopeptidase encoded by the
KEX2 gene of Saccharomyces cerevisiae (see Fuller et al., Microbiology 1986:273-278) or the equivalent protease of
other yeast species (see Julius et al. (1983) Cell 32:839-852). In the event that the KEX2 endopeptidase would cleave
a site within the peptide sequence for the mature heterologous protein of interest, other preferred processing sites
could be utilized such that the peptide sequence of interest remains intact (see, for example, Sambrook et al. (1989)
Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, New York)).
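
Whether a mature sequence contains an internal dibasic pair of the kind named above can be checked with a plain string scan. The sketch below assumes one-letter amino acid codes and only locates Lys/Arg pairs; it does not model the actual substrate specificity of the KEX2 endopeptidase, and the function names are hypothetical.

```python
import re

# Lookahead so that overlapping pairs (e.g. "RRK" -> RR and RK) are all found.
DIBASIC = re.compile(r"(?=([KR][KR]))")


def dibasic_sites(seq: str) -> list[tuple[int, str]]:
    """Return (1-based position, dipeptide) for each Lys/Arg pair in seq."""
    return [(m.start() + 1, m.group(1)) for m in DIBASIC.finditer(seq.upper())]


def has_internal_dibasic_site(mature_seq: str) -> bool:
    """True if the mature sequence itself carries a candidate dibasic site."""
    return bool(dibasic_sites(mature_seq))
```

A positive result for the mature sequence would suggest considering one of the other preferred processing sites mentioned above.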

[0028] A functional signal peptide sequence is essential to bring about extracellular secretion of a heterologous
protein from a yeast cell. Additionally, the hybrid precursor polypeptide may comprise a secretion leader peptide
sequence of a yeast secreted protein to further facilitate this secretion process. When present, the leader peptide
sequence is generally positioned immediately 3' to the signal peptide sequence processing site. By "secretion leader
peptide sequence" (LP) is intended a peptide that directs movement of a precursor polypeptide, which for the purposes
of this invention is the hybrid precursor polypeptide comprising the mature heterologous protein to be secreted, from
the ER to the Golgi apparatus and from there to a secretory vesicle for secretion across the cell membrane into the
cell wall area and/or the growth medium. The leader peptide sequence may be native or heterologous to the yeast
host cell but more preferably is native to the host cell.

[0029] The leader peptide sequence of the present invention may be a naturally occurring sequence for the same 
yeast secreted protein that served as the source of the signal peptide sequence, a naturally occurring sequence for a 
different yeast secreted protein, or a synthetic sequence (see, for example, WO 92/11378 and WO 95/02059), or any 
variants thereof that do not adversely affect the function of the leader peptide. 

[0030] For purposes of the invention, the leader peptide sequence when present is preferably derived from the same
yeast secreted protein that served as the source of the signal peptide sequence, more preferably an α-factor protein.
A number of genes encoding precursor α-factor proteins have been cloned and their combined signal-leader peptide
sequences identified. See, for example, Singh et al. (1983) Nucleic Acids Res. 11:4049-4063; Kurjan et al., U.S. Patent
No. 4,546,082; U.S. Patent No. 5,010,182. Alpha-factor signal-leader peptide sequences have been used to express
heterologous proteins in yeast. See, for example, Elliott et al. (1983) Proc. Natl. Acad. Sci. USA 80:7080-7084; Bitter
et al. (1984) Proc. Natl. Acad. Sci. 81:5330-5334; Smith et al. (1985) Science 229:1219-1229; WO 95/02059; U.S.
Patent Nos. 4,849,407 and 5,219,759.

[0031] Alpha-factor, an oligopeptide mating pheromone approximately 13 residues in length, is produced from a
larger precursor polypeptide of between about 100 and 200 residues in length, more typically about 120-160 residues.
This precursor polypeptide comprises the signal sequence, which is about 19-23 residues (more typically 20-22 residues), the
leader sequence, which is about 60 residues, and typically 2-6 tandem repeats of the mature pheromone sequence.
Although the signal peptide sequence and full-length α-factor leader peptide sequence can be used, more preferably
for this invention a truncated α-factor leader peptide sequence will be used with the signal peptide when both elements
are present in the hybrid precursor molecule.

[0032] By "truncated" α-factor leader peptide sequence is intended a portion of the full-length α-factor leader peptide
sequence that is about 20 to about 60 amino acid residues, preferably about 25 to about 50 residues, more preferably
about 30 to about 40 residues in length. Methods for using truncated α-factor leader sequences to direct secretion of
heterologous proteins in yeast are known in the art. See particularly U.S. Patent No. 5,602,034. When the hybrid
precursor polypeptide sequence comprises a truncated α-factor leader peptide, deletions to the full-length leader will
preferably be from the C-terminal end and will be done in such a way as to retain at least one glycosylation site
(-Asn-Y-Thr/Ser-, where Y is any amino acid residue) in the truncated peptide sequence. This glycosylation site, whose
modification is within skill in the art, is retained to facilitate secretion (see particularly WO 89/02463).
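
The C-terminal truncation rule above (keep the N-terminal portion and retain at least one -Asn-Y-Thr/Ser- site) can be sketched as follows. The leader string used here is an invented toy sequence, not the α-factor leader, and the function names are hypothetical; one-letter amino acid codes are assumed.

```python
import re

# -Asn-Y-Thr/Ser- with Y any residue, as the site is defined in the text.
# A lookahead is used so overlapping sites are all reported.
SEQUON = re.compile(r"(?=(N[A-Z][ST]))")


def glycosylation_sites(seq: str) -> list[int]:
    """Return 0-based start indices of every Asn-Y-Thr/Ser site in seq."""
    return [m.start() for m in SEQUON.finditer(seq)]


def truncate_leader(full_leader: str, length: int) -> str:
    """Delete residues from the C-terminal end, keeping `length` residues.

    Raises ValueError if the truncated leader keeps no complete site.
    """
    if not 0 < length <= len(full_leader):
        raise ValueError("length must be between 1 and the leader length")
    truncated = full_leader[:length]
    if not glycosylation_sites(truncated):
        raise ValueError("truncated leader retains no -Asn-Y-Thr/Ser- site")
    return truncated


# Hypothetical 22-residue toy leader with sites at indices 3 (NQT) and 16 (NGS).
TOY_LEADER = "AAGNQTLEDAAPGSVVNGSAEL"
```

With the toy leader, truncating to 10 residues keeps the first site, whereas truncating to 5 residues is rejected.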
[0033] When the hybrid precursor polypeptide sequence of the present invention comprises a leader peptide
sequence, such as the α-factor leader sequence, there will be a processing site immediately adjacent to the 3' end of
the leader peptide sequence. This processing site enables a proteolytic enzyme native to the yeast host cell to cleave
the yeast secretion leader peptide sequence from the 5' end of the native N-terminal propeptide sequence of the mature
heterologous protein of interest, when present, or from the 5' end of the peptide sequence for the mature heterologous
protein of interest. The processing site can comprise any peptide sequence that is recognized in vivo by a yeast
proteolytic enzyme such that the mature heterologous protein of interest can be processed correctly. The peptide sequence
for this processing site may be a naturally occurring peptide sequence for the native processing site of the leader
peptide sequence. More preferably, the naturally occurring processing site will be modified, or the processing site will
be synthetically derived, so as to be a preferred processing site as described above.

[0034] In the present invention, the nucleotide sequence encoding the hybrid precursor polypeptide comprises a 






EP 0 946 736 B1 



native propeptide sequence (PRO_MHP) for the mature heterologous protein of interest. By "native propeptide sequence"
or "native prosequence" is intended that portion of an intermediate precursor polypeptide (which is called a
"pro-protein") for a mature secreted protein that remains attached to the N-terminal and/or C-terminal end of the mature protein
sequence following cleavage of the native signal peptide sequence (or presequence) from the initial precursor
polypeptide (or "prepro-protein"). The residues of the propeptide sequence are not contained in the mature secreted protein.
Rather, such extra residues are removed at processing sites by proteolytic enzymes near the end of the secretory
pathway, in the trans-Golgi network (Griffiths and Simons (1986) Science 234:438-443) and secretory granules (Orci
et al. (1986) J. Cell Biol. 103:2273-2281).

[0035] The present invention provides for the presence of propeptide sequences that naturally occur at the N-terminal
and/or C-terminal end of the native pro-protein precursor form of the mature heterologous protein of interest. Thus, a
propeptide sequence may be positioned between the 3' end of the signal peptide sequence processing site, or the 3'
end of the yeast-recognized processing site adjacent to the leader peptide sequence if present, and the 5' end of the
peptide sequence for the mature heterologous protein of interest (an N-terminal propeptide sequence, PRO_MHP) or
immediately adjacent to the 3' end of the peptide sequence for the mature heterologous protein of interest (a C-terminal
propeptide sequence, CPRO_MHP), depending on its orientation within the native pro-protein. The invention also
provides for inclusion of both an N-terminal and a C-terminal propeptide sequence flanking the peptide sequence for the
mature heterologous protein of interest when both propeptide sequences exist in the native pro-protein. Where both
an N-terminal and a C-terminal propeptide sequence exist in the native pro-protein, preference for inclusion of both
propeptide sequences in the hybrid precursor polypeptide will be experimentally determined.

[0036] Methods are available in the art for determining the naturally occurring processing sites for the native signal
peptide and propeptide sequences of a prepro-protein (see, for example, von Heijne (1983) Eur. J. Biochem.
133:17-21, (1984) J. Mol. Biol. 173:243-251, (1986) J. Mol. Biol. 184:99-105, and (1986) Nucleic Acids Res. 14:4683-4690)
such that the native N-terminal and/or C-terminal propeptide sequence can be determined for use in the invention.
[0037] Immediately 3' to the native N-terminal propeptide sequence (when present) or immediately 5' to the
C-terminal propeptide sequence (when present) is a processing site that is recognized in vivo by a yeast proteolytic enzyme.
This processing site allows for cleavage of the propeptide sequence from the peptide sequence for the mature
heterologous protein of interest (MHP). It is recognized that this processing site may be the naturally occurring processing
site for the propeptide sequence if the naturally occurring site is recognized in vivo by a proteolytic enzyme of the yeast
host cell. More preferably, the naturally occurring processing site will be modified, or the processing site will be
synthetically derived, so as to be a preferred processing site. Examples of preferred processing sites include, but are not
limited to, those discussed above for the other processing sites. Preferably all of these processing sites will be similar such
that the same yeast proteolytic enzyme brings about cleavage of the signal and leader peptide sequences and the
native propeptide sequence(s).

[0038] In accordance with the invention as stated above, the yeast signal peptide and secretion leader peptide
sequences, as well as the native propeptide sequences, represent those parts of the hybrid precursor polypeptide of the
invention that can direct the sequence for the mature heterologous protein of interest through the secretory pathway
of a yeast host cell.

[0039] In one preferred embodiment of the present invention, the nucleotide sequence of the hybrid precursor
polypeptide comprises in the 5' to 3' direction:

5'-AFSP-tAFLP-PS_L-NPRO_PDGF-PS_NPRO-M_PDGF-3'

wherein:

AFSP comprises an α-factor signal peptide sequence and a processing site;

tAFLP comprises a truncated α-factor secretion leader peptide sequence;

PS_L comprises a preferred processing site for the leader peptide sequence;

NPRO_PDGF comprises the peptide sequence for a native N-terminal propeptide of a mature platelet-derived growth
factor (PDGF);

PS_NPRO comprises a preferred processing site for the N-terminal propeptide sequence; and

M_PDGF comprises the sequence for said mature PDGF.
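
The arrangement of this embodiment can be illustrated by assembling a toy precursor and cutting it in silico after each Lys-Arg site. All sequences below are invented placeholders, not PDGF or α-factor sequences; the signal peptide is simply removed by length, and the cut is placed after the dipeptide as a simplifying assumption. Function names are hypothetical.

```python
def build_precursor(signal: str, leader: str, npro: str, mature: str,
                    site: str = "KR") -> str:
    """Concatenate AFSP-tAFLP-PS_L-NPRO-PS_NPRO-M in 5' to 3' order."""
    return signal + leader + site + npro + site + mature


def cleave_after(seq: str, site: str = "KR") -> list[str]:
    """Split seq immediately after every occurrence of the processing site."""
    fragments, start = [], 0
    i = seq.find(site)
    while i != -1:
        end = i + len(site)
        fragments.append(seq[start:end])
        start = end
        i = seq.find(site, start)
    fragments.append(seq[start:])  # whatever follows the last site
    return fragments


# Toy placeholder elements (assumptions for illustration only).
SIGNAL, LEADER, NPRO, MATURE = "MLLSA", "APGNTTED", "QDPLE", "SLGSTAEC"

precursor = build_precursor(SIGNAL, LEADER, NPRO, MATURE)
# Drop the signal peptide, then cut at the two Lys-Arg processing sites.
fragments = cleave_after(precursor[len(SIGNAL):])
```

In this toy case the last fragment is the intact mature sequence, which holds only because the placeholder mature sequence contains no internal Lys-Arg pair.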

[0040] Preferably the α-factor signal peptide and truncated α-factor secretion leader peptide sequences are derived
from the Matα gene of S. cerevisiae as outlined in the examples. The preferred truncated α-factor leader peptide
sequence will include the N-terminal portion of the full-length leader sequence; that is, the leader sequence will start
with the first amino acid residue of the full-length sequence and run the length of about 20 to about 60 amino acid
residues, preferably about 25 to about 50 residues, more preferably about 30 to about 40 residues. In one embodiment,
a leader of about 35 residues is used.

[0041] The mature protein of this preferred embodiment is human platelet-derived growth factor (PDGF). PDGF, the
primary mitogen in serum for mesenchymal-derived cells, is stored in platelet alpha-granules. Injury to blood vessels
activates the release of PDGF from these granules in the vicinity of the injured vessels. This mitogen acts as a potent
chemoattractant for fibroblasts and smooth muscle cells, as well as monocytes and neutrophils. The mitogenic activity
of the localized PDGF results in proliferation of these cells at the site of injury, contributing to the process of wound repair.

[0042] Purified native platelet-derived growth factor (PDGF), a glycoprotein of about 30,000 daltons, is composed
of two disulfide-linked polypeptide chains. Two forms of these chains, designated A and B, have been identified. The
native protein occurs as the homodimer AA or BB or the heterodimer AB, or a mixture thereof. A partial amino acid
sequence for the PDGF-A chain has been identified (Johnsson et al. (1984) EMBO J. 3:921-928) and cDNAs encoding
two forms of PDGF A-chain precursors have been described (U.S. Patent No. 5,219,759). The A-chain is derived by
proteolytic processing of a 211 amino acid precursor polypeptide. The cDNA encoding the PDGF-B chain has also
been described (Nature (1985) 316:748-750). The B-chain is derived by proteolytic processing of a 241 amino acid
precursor.

[0043] The mature PDGF protein of the present invention will be the biologically active dimeric form, including the
homodimers PDGF-AA and PDGF-BB or the heterodimer PDGF-AB, and any substantially homologous and
functionally equivalent variants thereof as defined above. For example, the native amino acid sequence for the A-chain or the
B-chain may be truncated at either the N-terminal or C-terminal end. Thus removal of up to 15 or up to 10 amino acids
from the N-terminal or C-terminal end, respectively, of the B-chain does not affect biological activity of the variant.
Additionally, amino-acid substitutions may be made. For example, an amino acid such as serine may be substituted
for any of the cysteine residues at positions 43, 52, 53, and 97 of the native human B-chain and at corresponding
positions in the native A-chain to obtain substantially homologous and functionally equivalent variants of the native
chain. Variants of the A-chain are known based on cloned DNA sequences, such as, for example, variants having an
additional 6 or 19 amino acids at the C-terminal end. See, for example, Tong et al. (1987) Nature 328:619-621; Betsholtz
et al. (1986) Nature 320:695-699. One PDGF B-chain variant may be the corresponding substantially homologous
portion of the amino-acid sequence encoded by the v-sis gene of simian sarcoma virus. The homologous region of the
product of this gene, p28sis, begins at amino acid 67 and continues to amino acid 175, and differs from the human
B-chain by only 4 amino acid residues (see, for example, European Patent Application No. 0 487 116 A1). Functionally
equivalent variants can be determined with assays for biological activity as described in the examples.
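
The cysteine-to-serine substitutions mentioned above can be sketched as a position-checked replacement. The sequence in the usage note is an invented placeholder rather than the PDGF B-chain, positions are 1-based as in the text, and the function name is hypothetical.

```python
def substitute_residues(seq: str, positions: list[int],
                        new: str = "S", expected: str = "C") -> str:
    """Replace the residue at each 1-based position with `new`.

    Each targeted position must currently hold `expected` (Cys by default);
    otherwise a ValueError is raised so that a mis-numbered site is caught.
    """
    residues = list(seq)
    for pos in positions:
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != expected:
            raise ValueError(f"position {pos} is not {expected}")
        residues[pos - 1] = new
    return "".join(residues)
```

Applied to the placeholder "ACDCE" at positions 2 and 4, the routine returns "ASDSE"; for the native B-chain the positions recited above (43, 52, 53, and 97) would be supplied instead.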

[0044] The nucleotide sequence encoding the mature PDGF protein of the present invention may be genomic, cDNA,
or synthetic DNA. The genes encoding the native forms of PDGF have been sequenced, and several variants are well
known in the art. Expression of PDGF homodimers and heterodimers is described in, for example, U.S. Patent Nos.
4,766,073; 4,769,328; 4,801,542; 4,845,075; 4,849,407; 5,045,633; 5,128,321; and 5,187,263. Based on the known
amino acid sequences for the A- and B-chain polypeptides, synthetic nucleotide sequences encoding PDGF A-chain
and B-chain polypeptides may be made in vitro using methods available in the art. See particularly Sambrook et al.
(1989) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, New York). Where the mature protein of interest
is the heterodimer PDGF-AB, the nucleotide sequences encoding the hybrid precursor polypeptides comprising the
A- and B-chain polypeptides may be assembled as part of one expression cassette or assembled into separate
expression cassettes for cotransformation of a yeast host cell.

[0045] In this preferred embodiment comprising mature PDGF, the C-terminal end of the truncated α-factor secretion
leader peptide sequence and of the native N-terminal propeptide sequence will terminate in a preferred processing
site, preferably a dibasic processing site that is specific for the KEX2 endopeptidase of S. cerevisiae. The dipeptides
can be any combination of the basic residues Lys and Arg, more preferably a Lys-Arg dipeptide.
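
By way of illustration only, the effect of a dibasic processing site on a hybrid precursor can be sketched in Python. The sketch assumes that cleavage occurs on the carboxyl side of the dibasic pair, as is generally described for KEX2. The function names and the toy peptide strings are hypothetical and do not correspond to any sequence of the invention.

```python
import re

# Any pair of the basic residues Lys (K) and Arg (R): KK, KR, RK, RR.
# The lookahead lets overlapping pairs be reported as well.
DIBASIC = re.compile(r"(?=[KR]{2})")


def dibasic_sites(peptide):
    """Return the 0-based start index of every Lys/Arg dipeptide."""
    return [m.start() for m in DIBASIC.finditer(peptide.upper())]


def cleave_after(peptide, site="KR"):
    """Split a peptide on the carboxyl side of each occurrence of `site`.

    Assumption: cleavage is modeled as a simple string split; real
    processing efficiency depends on sequence context and folding.
    """
    fragments, start = [], 0
    idx = peptide.find(site)
    while idx != -1:
        end = idx + len(site)
        fragments.append(peptide[start:end])
        start = end
        idx = peptide.find(site, end)
    if start < len(peptide):
        fragments.append(peptide[start:])
    return fragments
```

For example, `cleave_after("LEADERKRMATURE")` separates a toy leader ending in Lys-Arg from a toy mature segment.
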
[0046] The native prepro-PDGF-B additionally comprises a 51 amino acid C-terminal propeptide. In another preferred
embodiment, the nucleotide sequence encoding the hybrid precursor polypeptide comprises in the 5' to 3' direction
the following modified sequence:

5'-AFSP-tAFLP-PS_L-NPRO_PDGF-PS_NPRO-M_PDGF-PS_CPRO-CPRO_PDGF-3'

wherein:

CPRO_PDGF comprises a C-terminal propeptide sequence for said PDGF mature heterologous protein of interest;
and

PS_CPRO comprises a preferred processing site for the C-terminal propeptide sequence.

[0047] Preferably the preferred processing site for the C-terminal propeptide sequence is similar to that of the leader
peptide sequence and the N-terminal propeptide sequence, such that the same yeast proteolytic enzyme brings about
cleavage of the α-factor leader peptide sequence and the sequences for both of the native propeptides. Inclusion of
these two additional components is experimentally determined.

[0048] In another preferred embodiment of the invention, the nucleotide sequence of the hybrid precursor polypeptide
comprises in the 5' to 3' direction:

5'-AFSP-AFLP-PS_L-M_IGF-PS_CPRO-CPRO_IGF-3'

wherein:

AFSP comprises an α-factor signal peptide sequence and a processing site;
AFLP comprises an α-factor secretion leader peptide sequence;
PS_L comprises a preferred processing site for the leader peptide sequence;
M_IGF comprises the peptide sequence for a mature insulin-like growth factor (IGF);
PS_CPRO comprises a preferred processing site for the C-terminal propeptide sequence; and
CPRO_IGF comprises the peptide sequence for a native C-terminal propeptide of said mature IGF.
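
The 5' to 3' ordering recited above may be easier to follow as a data structure. The following Python sketch is offered as an illustration under stated assumptions: the `Element` class, the helper names, and the placeholder nucleotide strings are hypothetical, and only the element labels and their order are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str      # label from the text, with "_" marking a subscript
    sequence: str  # placeholder nucleotides, NOT a real sequence


# Order of elements recited for the IGF embodiment.
IGF_ORDER = ("AFSP", "AFLP", "PS_L", "M_IGF", "PS_CPRO", "CPRO_IGF")


def layout(elements):
    """Render the 5' to 3' arrangement of element labels."""
    return "5'-" + "-".join(e.name for e in elements) + "-3'"


def assemble(elements):
    """Concatenate element sequences in the listed 5' to 3' order."""
    return "".join(e.sequence for e in elements)


def in_frame(elements):
    """True if every element is a whole number of codons, so that the
    fused open reading frame is not shifted at any junction."""
    return all(len(e.sequence) % 3 == 0 for e in elements)


# Hypothetical placeholder sequences used only to exercise the helpers.
PLACEHOLDERS = ("ATGAGA", "GCTCCA", "AAGAGA", "GGTCCA", "AAGAGA", "AGATCT")
cassette = [Element(n, s) for n, s in zip(IGF_ORDER, PLACEHOLDERS)]
```

Checking `in_frame(cassette)` before joining the parts reflects the requirement that the leader, processing sites, and mature sequence be fused in the same reading frame.
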

[0049] Preferably the α-factor signal peptide and α-factor secretion leader peptide sequences are derived from the
Matα gene of S. cerevisiae as outlined for the preferred embodiment for PDGF.

[0050] The mature protein of this preferred embodiment is insulin-like growth factor (IGF), more particularly IGF-I.
Insulin-like growth factor I (IGF-I) belongs to a family of polypeptides known as somatomedins. IGF-I stimulates growth
and division of a variety of cell types, particularly during development. See, for example, European Patent Application
Nos. 560,723 A and 436,469 B. Thus, processes such as skeletal growth and cell replication are affected by IGF-I levels.
[0051] IGF-I is structurally and functionally similar to, but antigenically distinct from, insulin. In this regard, IGF-I is a
single-chain polypeptide with three intrachain disulfide bridges and four domains known as the A, B, C, and D domains,
respectively. The A and B domains are connected by the C domain, and are homologous to the corresponding domains
of proinsulin. The D domain, a C-terminal prosequence, is present in IGF-I but is absent from proinsulin. IGF-I has 70
amino acid residues and a molecular mass of approximately 7.5 kDa. See Rinderknecht (1978) J. Biol. Chem. 253:2769
and FEBS Lett. 89:283. For a review of IGF, see Humbel (1990) Eur. J. Biochem. 190:445-462.

[0052] The mature IGF protein of the present invention will be the biologically active form and any substantially
homologous and functionally equivalent variants thereof as defined above. Functionally equivalent variants can be
determined with assays for biological activity, including the assay described in the examples. Representative assays
include known radioreceptor assays using placental membranes (see, for example, U.S. Patent No. 5,324,639; Hall
et al. (1974) J. Clin. Endocrinol. and Metab. 39:973-976; and Marshall et al. (1974) J. Clin. Endocrinol. and Metab.
39:283-292), a bioassay that measures the ability of the molecule to enhance incorporation of tritiated thymidine, in a
dose-dependent manner, into the DNA of BALB/c 3T3 fibroblasts (see, for example, Tamura et al. (1989) J. Biol. Chem.
262:5616-5621), and the like; herein incorporated by reference.

[0053] The art provides substantial guidance regarding the preparation and use of IGF-I variants. For example, a
fragment of IGF-I will generally include at least about 10 contiguous amino acid residues of the full-length molecule,
preferably about 15-25 contiguous amino acid residues of the full-length molecule, and most preferably about 20-50 or
more contiguous amino acid residues of full-length IGF-I. The term "IGF-I analog" also captures peptides having one
or more peptide mimics ("peptoids"), such as those described in International Publication No. WO 91/04282. Several
IGF-I analogs and fragments are known in the art and include those described in, for example, Proc. Natl. Acad. Sci.
USA (1986) 83:4904-4907; Biochem. Biophys. Res. Commun. (1987) 149:398-404; J. Biol. Chem. (1988) 263:6233-6239;
Biochem. Biophys. Res. Commun. (1989) 165:766-771; Forsberg et al. (1990) Biochem. J. 271:357-363;
U.S. Patent Nos. 4,876,242 and 5,077,276; International Publication Nos. WO 87/01038 and WO 89/05822.
Representative analogs include one with a deletion of Glu-3 of the mature molecule, analogs with up to five amino acids
truncated from the N-terminus, an analog with a truncation of the first three N-terminal amino acids, and an analog
including the first 17 amino acids of the B chain of human insulin in place of the first 16 amino acids of human IGF-I.

[0054] The nucleotide sequence encoding the mature IGF protein of the present invention may be genomic, cDNA,
or synthetic DNA. The genes encoding the native forms of IGF have been sequenced, and several variants are well
known in the art. IGF-I and variants thereof can be produced in any number of ways that are well known in the art. For
example, the IGF-I polypeptides can be isolated directly from blood, such as from serum or plasma, by known methods.
See, for example, U.S. Patent No. 4,769,361; Svoboda et al. (1980) Biochemistry 19:790-797; Cornell and Boughdady
(1982) Prep. Biochem. 12:57 and (1984) Prep. Biochem. 14:123. Alternatively, IGF-I can be synthesized chemically,
by any of several techniques that are known to those skilled in the art. See, for example, Stewart and Young (1984)
Solid Phase Peptide Synthesis (Pierce Chemical Company, Rockford, Illinois) and Barany and Merrifield (1980) The
Peptides: Analysis, Synthesis, Biology (eds. Gross and Meienhofer) pp. 3-254, Vol. 2 (Academic Press, New York), for
solid phase peptide synthesis techniques; and Bodansky (1984) Principles of Peptide Synthesis (Springer-Verlag,
Berlin) and Gross and Meienhofer, eds. (1980) The Peptides: Analysis, Synthesis, Biology, Vol. 1, for classical solution
synthesis. The IGF-I polypeptides of the present invention can also be chemically prepared by the method of
simultaneous multiple peptide synthesis. See, for example, Houghten (1985) Proc. Natl. Acad. Sci. USA 82:5131-5135;
U.S. Patent No. 4,631,211.

[0055] In this preferred embodiment comprising mature IGF-I, the C-terminal end of the truncated α-factor secretion
leader peptide sequence and the N-terminal end of the native C-terminal propeptide sequence will terminate in a
preferred processing site, preferably a dibasic processing site that is specific for the KEX2 endopeptidase of
S. cerevisiae. The dipeptides can be any combination of the basic residues Lys and Arg, more preferably a Lys-Arg
dipeptide.

[0056] The nucleotide sequences of the present invention are useful for producing biologically active mature
heterologous proteins of interest in a yeast host cell when operably linked to a yeast promoter. In this manner, the
nucleotide sequences encoding the hybrid precursor polypeptides of the invention are provided in expression cassettes
for introduction into a yeast host cell. These expression cassettes will comprise a transcriptional initiation region linked
to the nucleotide sequence encoding the hybrid precursor polypeptide. Such an expression cassette is provided with a
plurality of restriction sites for insertion of the nucleotide sequence to be under the transcriptional regulation of the
regulatory regions. The expression cassette may additionally contain selectable marker genes.

[0057] Such an expression cassette comprises in the 5' to 3' direction and operably linked a yeast-recognized
transcription and translation initiation region, a nucleotide coding sequence for the hybrid precursor polypeptide
comprising the sequence for the mature protein of interest, and a yeast-recognized transcription and translation
termination region. By "operably linked" is intended that expression of the coding sequence for the hybrid precursor
polypeptide is under the regulatory control of the yeast-recognized transcription and translation initiation and
termination regions.
[0058] By "yeast-recognized transcription and translation initiation and termination regions" is intended regulatory
regions that flank a coding sequence, in this case the nucleotide sequence encoding the hybrid polypeptide sequence,
and control transcription and translation of the coding sequence in a yeast. These regulatory regions must be functional
in the yeast host. The transcription initiation region, the yeast promoter, provides a binding site for RNA polymerase
to initiate downstream (3') transcription of the coding sequence. The promoter may be a constitutive or inducible
promoter, and may be native or analogous or foreign or heterologous to the specific yeast host. Additionally, the
promoter may be the natural sequence or alternatively a synthetic sequence. By foreign is intended that the transcription
initiation region is not found in the native yeast of interest into which the transcription initiation region is introduced.

[0059] Suitable native yeast promoters include, but are not limited to, the wild-type α-factor promoter, as well as other
yeast promoters. Preferably the promoter is selected from the list including promoters for the glycolytic enzymes
phosphoglucoisomerase, phosphofructokinase, phosphotrioseisomerase, phosphoglucomutase, enolase, pyruvate kinase
(PyK), glyceraldehyde-3-phosphate dehydrogenase (GAP or GAPDH), and alcohol dehydrogenase (ADH) (EPO
Publication No. 284,044). See, for example, EPO Publication Nos. 120,551 and 164,556.

[0060] Synthetic hybrid promoters consisting of the upstream activator sequence of one yeast promoter, which allows
for inducible expression, and the transcription activation region of another yeast promoter also serve as functional
promoters in a yeast host. Examples of hybrid promoters include ADH/GAP, where the inducible region of the ADH
promoter is combined with the activation region of the GAP promoter (U.S. Patent Nos. 4,876,197 and 4,880,734).
Other hybrid promoters using upstream activator sequences of either the ADH2, GAL4, GAL10, or PHO5 genes
combined with the transcriptional activation region of a glycolytic enzyme such as GAP or PyK are available in the art
(EPO Publication No. 164,556). More preferably the yeast promoter is the inducible ADH/GAP hybrid promoter.
[0061] Yeast-recognized promoters also include naturally occurring non-yeast promoters that bind yeast RNA
polymerase and initiate transcription of the coding sequence. Such promoters are available in the art. See, for example,
Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078; Mercereau-Puigalon et al. (1980) Gene 11:163; Panthier et
al. (1980) Curr. Genet. 2:109; Henikoff et al. (1981) Nature 283:835; and Hollenberg et al. (1981) Curr. Topics Microbiol.
Immunol. 96:119.

[0062] The termination regulatory region of the expression cassette may be native with the transcription initiation
region, or may be derived from another source, providing that it is recognized by the yeast host. The termination regions
may be those of the native α-factor transcription termination sequence, or another yeast-recognized termination
sequence, such as those for the glycolytic enzymes mentioned above. More preferably the transcription terminator is the
Matα (α-factor) transcription terminator described in U.S. Patent No. 4,870,008.

[0063] The nucleotide sequences encoding the hybrid precursor polypeptides of the invention are provided in
expression cassettes for expression in a yeast host. The cassette will include 5' and 3' regulatory sequences operably
linked to the nucleotide sequence encoding the hybrid precursor polypeptide of interest. The cassette may also contain
at least one additional nucleotide sequence of interest to be cotransformed into the yeast host. Alternatively, the
additional nucleotide sequences can be provided on another expression cassette. Where appropriate, the nucleotide
sequence encoding the hybrid precursor polypeptide and any additional nucleotide sequences of interest may be
optimized for increased expression in the transformed yeast. That is, these nucleotide sequences can be synthesized
using yeast-preferred codons for improved expression. Methods are available in the art for synthesizing yeast-preferred
nucleotide sequences of interest (see, for example, U.S. Patent Nos. 5,219,759 and 5,602,034).
[0064] Additional sequence modifications are known to enhance expression of nucleotide coding sequences in a 
cellular host. These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice 
site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene
expression. The G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated 
by reference to known genes expressed in the host cell. When possible, the nucleotide coding sequence is modified 
to avoid predicted hairpin secondary mRNA structures. 
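
As a minimal sketch of the two calculations mentioned above, the following Python code back-translates a peptide using one codon per amino acid and computes the G-C content of the result. The codon table is an illustrative assumption (one commonly cited preferred codon per residue for highly expressed S. cerevisiae genes) and is not taken from the cited patents; the function names are likewise hypothetical. A real design would also screen for the deleterious signals and hairpin structures discussed above, which this sketch does not attempt.

```python
# Illustrative one-codon-per-residue table (an assumption, not the method
# of the cited patents). Each codon is a valid standard-code codon for
# the residue it is mapped to.
YEAST_PREFERRED = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def back_translate(protein):
    """Return a coding sequence using one assumed preferred codon per residue."""
    return "".join(YEAST_PREFERRED[aa] for aa in protein.upper())


def gc_content(dna):
    """Return the fraction of G and C bases in a DNA string (0.0 if empty)."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)
```

The G-C fraction of a candidate coding sequence can then be compared with the average for known genes of the host before the design is adjusted.
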

[0065] In preparing the expression cassette, the various nucleotide sequence fragments may be manipulated, so as
to provide for the sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this
end, adapters or linkers may be employed to join the nucleotide fragments or other manipulations may be involved to
provide for convenient restriction sites, removal of superfluous nucleotides, removal of restriction sites, or the like. For
this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and
transversions, may be involved. See particularly Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (Cold
Spring Harbor, New York).

[0066] The expression cassettes of the present invention can be ligated into a replicon (e.g., plasmid, cosmid, virus,
mini-chromosome), thus forming an expression vector that is capable of autonomous DNA replication in vivo. Preferably
the replicon will be a plasmid. Such a plasmid expression vector will be maintained in one or more replication systems,
preferably two replication systems, that allow for stable maintenance within a yeast host cell for expression purposes,
and within a prokaryotic host for cloning purposes. Examples of such yeast-bacteria shuttle vectors include YEp24
(Botstein et al. (1979) Gene 8:17-24), pC1/1 (Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646), and YRp17
(Stinchcomb et al. (1982) J. Mol. Biol. 158:157).

[0067] Additionally, a plasmid expression vector may be a high or low copy number plasmid, the copy number
generally ranging from about 1 to about 200. With high copy number yeast vectors, there will generally be at least 10,
preferably at least 20, and usually not exceeding about 150 copies in a single host. Depending upon the heterologous
protein selected, either a high or low copy number vector may be desirable, depending upon the effect of the vector
and the foreign protein on the host. See, for example, Brake et al. (1984) Proc. Natl. Acad. Sci. USA 81:4642-4646.
DNA constructs of the present invention can also be integrated into the yeast genome by an integrating vector.
Examples of such vectors are known in the art. See, for example, Botstein et al. (1979) Gene 8:17-24.

[0068] The host chosen for expression of the heterologous proteins of the invention will preferably be a yeast. By
"yeast" is intended ascosporogenous yeasts (Endomycetales), basidiosporogenous yeasts, and yeast belonging to
the Fungi Imperfecti (Blastomycetes). The ascosporogenous yeasts are divided into two families, Spermophthoraceae
and Saccharomycetaceae. The latter is comprised of four subfamilies, Schizosaccharomycoideae (e.g., genus
Schizosaccharomyces), Nadsonioideae, Lipomycoideae, and Saccharomycoideae (e.g., genera Pichia, Kluyveromyces,
and Saccharomyces). The basidiosporogenous yeasts include the genera Leucosporidium, Rhodosporidium,
Sporidiobolus, Filobasidium, and Filobasidiella. Yeast belonging to the Fungi Imperfecti are divided into two families,
Sporobolomycetaceae (e.g., genera Sporobolomyces, Bullera) and Cryptococcaceae (e.g., genus Candida). Of
particular interest to the present invention are species within the genera Pichia, Kluyveromyces, Saccharomyces,
Schizosaccharomyces, and Candida. Of particular interest are the Saccharomyces species S. cerevisiae,
S. carlsbergensis, S. diastaticus, S. douglasii, S. kluyveri, S. norbensis, and S. oviformis. Species of particular interest
in the genus Kluyveromyces include K. lactis. Since the classification of yeast may change in the future, for the purposes
of this invention, yeast shall be defined as described in Skinner et al., eds. (1980) Biology and Activities of Yeast (Soc.
App. Bacteriol. Symp. Series No. 9). In addition to the foregoing, those of ordinary skill in the art are presumably familiar
with the biology of yeast and the manipulation of yeast genetics. See, for example, Bacila et al., eds. (1978) Biochemistry
and Genetics of Yeast; Rose and Harrison, eds. (1987) The Yeasts (2nd ed.); Strathern et al., eds. (1981) The Molecular
Biology of the Yeast Saccharomyces.

[0069] The selection of suitable yeast and other microorganism hosts for the practice of the present invention is
within the skill of the art. When selecting yeast hosts for expression, suitable hosts may include those shown to have,
inter alia, good secretion capacity, low proteolytic activity, and overall vigor. Yeast and other microorganisms are
generally available from a variety of sources, including the Yeast Genetic Stock Center, Department of Biophysics and
Medical Physics, University of California, Berkeley, California; and the American Type Culture Collection, Rockville,
Maryland.

[0070] Methods of introducing exogenous DNA into yeast hosts are well known in the art. There is a wide variety of
ways to transform yeast. For example, spheroplast transformation is taught by Hinnen et al. (1978) Proc. Natl. Acad.
Sci. USA 75:1919-1933 and Stinchcomb et al., EPO Publication No. 45,573; herein incorporated by reference.
Transformants are grown in an appropriate nutrient medium, and, where appropriate, maintained under selective pressure
to ensure retention of the exogenous DNA. Where expression is inducible, the yeast host can first be grown to a high
cell density, and expression is then induced. The secreted, mature heterologous protein can be harvested by any
conventional means, and purified by chromatography, electrophoresis, dialysis, solvent-solvent extraction, and the like.

[0071] The following examples are offered by way of illustration and not by way of limitation. 

EXAMPLES

[0072] The following examples further describe the construction of an expression vector comprising the nucleotide 
sequence encoding mature human PDGF-B in accordance with the disclosed invention. Examples demonstrating the 
use of this expression vector to produce biologically active mature PDGF-BB in a yeast host are also provided. 
[0073] Additional examples describe an expression vector comprising the nucleotide sequence encoding mature
human IGF-I in accordance with the disclosed invention and demonstrate the use of this expression vector to produce 
biologically active mature IGF-I in a yeast host. 

Example 1: Plasmid Vector pAB24

[0074] The vector selected for expressing rhPDGF-BB, pAB24, is a yeast-bacteria shuttle vector. The plasmid is a
chimera of sequences from pBR322, derived from several naturally occurring bacterial plasmids, and sequences of
the endogenous S. cerevisiae 2-μ plasmid (Broach (1981) in Molecular Biology of the Yeast Saccharomyces (Cold
Spring Harbor Press, New York), 1:445-470). It also encodes genes enabling selection in both E. coli and S. cerevisiae
hosts. The pBR322 part of pAB24 includes the ampicillin resistance (Ap^r)-conferring gene encoding β-lactamase, as
well as a gene conferring tetracycline resistance (Tc^r). These genes allow transformation of competent E. coli and
selection of plasmid-containing bacteria. A unique BamHI cloning site, present in the gene encoding tetracycline
resistance, is the site utilized for insertion of an expression cassette. The pBR322 portion of the vector also includes a
ColE1-like replication origin enabling replication in E. coli. Two S. cerevisiae genes derived from YEp24 (Botstein et
al. (1979) Gene 8:17-24), URA3 and leu2d, enable selection in yeast host strains lacking either or both of these genes.
The latter gene, leu2d, lacks a portion of the 5'-untranslated promoter region and requires high plasmid copy number
for growth in leucine-deficient medium. This is necessary to achieve sufficient LEU2 protein expression for
complementation of yeast strains lacking LEU2 (Erhart and Hollenberg (1983) J. Bacteriol. 156:625-635). The 2-μ
sequences of pAB24 confer replication and partitioning of the expression plasmid in S. cerevisiae. Figure 1 shows a
schematic map of plasmid pAB24 with key restriction sites and genetic elements. A description of the construction of
pAB24 can be found in the European Patent Application publication EPO 0324 274 B1.

[0075] Three expression plasmids containing the PDGF-B gene, pYAGL7PB, pYL7PPB (also known as
pYAGL7PPB), and pYJST400, were used to produce PDGF-BB in a yeast host. All of these expression vectors utilize
pAB24 as the plasmid into which the expression cassette comprising the PDGF-B gene was inserted.

Example 2: Construction of Expression Plasmid pYAGL7PB 
General Description 

[0076] Plasmid pYAGL7PB includes an expression cassette with the following features. Transcription is mediated
by the inducible, hybrid yeast promoter ADH/GAP. This promoter includes ADR2 transcription factor responsive
sequences from the S. cerevisiae ADH2 gene (Beier and Young (1982) Nature 300:724-728) and promoter sequences
from the S. cerevisiae gene TDH3, encoding the glycolytic enzyme glyceraldehyde-3-phosphate dehydrogenase
(GAP). The ADR2 transcription factor responsive sequences confer inducible gene transcription upon downstream
sequences. Induction is achieved by glucose depletion in the growth medium. Termination of transcription is mediated
by the terminator derived from the S. cerevisiae mating factor type alpha (Matα) gene (Brake et al. (1984) Proc. Natl.
Acad. Sci. USA 81:4642-4646).

[0077] The cassette further includes an open reading frame encoding a truncated Matα sequence fused to a
sequence encoding the human PDGF-B gene. The truncated α-factor leader mediates secretion of in-frame protein
fusions. It is a derivative of S. cerevisiae α-factor leader, the product of the Matα gene (Kurjan and Herskowitz (1982)
Cell 30:933-943). A dibasic amino acid processing site is present at the truncated α-factor leader/PDGF-B junction to
facilitate production of correctly processed rhPDGF-BB polypeptide by yeast. Figure 2 shows a map of the pYAGL7PB
expression cassette highlighting these features and the restriction enzyme sites relevant to the construction of this
expression cassette. The nucleotide sequence and predicted amino acid sequence of the open reading frame encoding
the truncated α-factor leader-PDGF-B primary translation product are given in SEQ ID NO: 1 and SEQ ID NO: 2,
respectively.




EP 0 946 736 B1 

Sequential Construction of pYAGL7PB 

[0078] Following is a description of the sequential steps taken to construct this expression vector. 

Construction of PDGF-B Synthetic Gene and Cloning into a Yeast Expression Vector

[0079] The synthetic gene encoding the partial dibasic processing site and rhPDGF-B (SEQ ID NOs: 3-4) was made
from 17 overlapping oligonucleotides (SEQ ID NOs: 5-21) as described in Urdea et al. (Proc. Natl. Acad. Sci. USA 80
(1983):7461-7465). Ligation of the fragments resulted in an XbaI-SalI fragment, which was subsequently inserted into
XbaI-SalI cut pPAG/αF vector.
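
The idea of building a gene from overlapping pieces can be sketched in Python as follows. This is a deliberate simplification and not the procedure of Urdea et al.: it merges ordered, same-strand fragments by their shared ends, whereas the laboratory method anneals complementary oligonucleotides and ligates them. The function name, the `min_overlap` parameter, and the toy fragments are hypothetical.

```python
def merge_overlapping(fragments, min_overlap=4):
    """Merge ordered same-strand fragments that share overlapping ends.

    Each fragment must begin with at least `min_overlap` bases that match
    the end of the sequence assembled so far; the longest such overlap is
    used. Raises ValueError when no sufficient overlap is found.
    """
    if not fragments:
        return ""
    assembled = fragments[0]
    for frag in fragments[1:]:
        longest = min(len(assembled), len(frag))
        for k in range(longest, min_overlap - 1, -1):
            if assembled.endswith(frag[:k]):
                assembled += frag[k:]
                break
        else:
            raise ValueError("no overlap of at least %d bases for %r" % (min_overlap, frag))
    return assembled
```

With the toy input `["ATGGCTAGC", "TAGCTTGAA", "TGAAGGT"]`, each fragment shares four bases with its predecessor, so the merged sequence contains every base once.
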

[0080] Plasmid pPAG/αF is a pBR322 derivative with an expression cassette delineated by BamHI sites. The
expression cassette includes the ADH/GAP hybrid promoter, as well as the open reading frame encoding the yeast
α-factor leader (BamHI-XbaI), an XbaI-SalI gene fragment, and the Matα (α-factor) transcription terminator (SalI-BamHI).
Substitution of an XbaI-SalI gene fragment (in-frame) capable of heterologous protein expression into this plasmid
allows the expression and secretion of the heterologous protein. The isolation of the yeast glyceraldehyde-3-phosphate
(GAP) gene promoter, the origin of the ADH2 component of the promoter, and the construction of a hybrid ADH/GAP
promoter are described in U.S. Patent Nos. 4,876,197 and 4,880,734. The isolation of the yeast α-factor gene including
the transcription terminator is described in U.S. Patent No. 4,870,008.

[0081] Upon dideoxy sequencing, the synthetic gene sequence was found to have a single base pair mutation, which
was repaired by standard procedures. Plasmid pPAGBB-1 is the plasmid derived from pPAG/αF that contains the
correct synthetic PDGF-B (XbaI-SalI) gene.

Construction of Synthetic Truncated α-Factor Leader Gene with Dibasic Processing Site

[0082] The truncated α-factor leader mediates secretion of in-frame hybrid polypeptides. It is a derivative of
S. cerevisiae α-factor leader, the secretion leader for mating factor type alpha, the product of the Matα gene (Kurjan and
Herskowitz (1982) Cell 30:933-943), and consists of the first 35 amino acids of the native leader. The construction and
use of a truncated α-factor leader gene to mediate secretion is described in EPO Publication No. 0324 274 B1. Synthetic
oligonucleotides encoding a comparable, partial (amino acids 8-35) truncated α-factor leader (L7) and part of the
dibasic processing site were made from oligonucleotides given in SEQ ID NO: 22 and, when assembled with the
complementary strand shown in SEQ ID NO: 23, resulted in a PstI-BglII fragment with a 3'-ACGTC- and a 5'-CTAG-
overhang to allow for convenient ligation into the expression cassette.

Construction of pAGL7PB 

[0083] The purpose of this construction was the substitution of the synthetic, partial truncated α-factor leader
PstI-BglII gene fragment described above for most of the full-length α-factor leader in the PDGF-B expression cassette
of pPAGBB-1. A 1.9 kb PstI fragment including pBR322 sequences, the ADH/GAP hybrid promoter (marked at the 5'
end by a BamHI site) and the 5' partial α-factor leader gene sequence (encoding the first seven amino acids of the
native α-factor leader) was isolated from pPAGBB-1. It was ligated to kinased, annealed synthetic oligonucleotides
1.49/3°.40. Following digestion with BamHI, a partial expression cassette 5' fragment was obtained including
sequences for the ADH/GAP hybrid promoter and the 5' portion of the truncated α-factor leader.

[0084] Similarly, a BglII fragment containing the PDGF-B synthetic gene, the α-factor terminator (marked at the 3'
end by a BamHI site) and pBR322 sequences was isolated from pPAGBB-1. It was ligated to kinased, annealed
synthetic oligonucleotides 2.32/4°.50. Following digestion with BamHI, a partial expression cassette 3' fragment was
obtained including sequences for the 3' portion of the truncated α-factor leader, PDGF-B, and the α-factor leader
transcription terminator. The complete PDGF-B expression cassette was obtained following ligation of the 5' and 3'
partial expression cassette gene fragments and digestion with BamHI. The BamHI expression cassette was cloned into
the BamHI site of a pBR322-derived vector (pBRΔEco-Sal) to give plasmid pAGL7PB. A map of the PDGF-B expression
cassette in this plasmid is shown in Figure 2.

Construction of pYAGL7PB 

[0085] The PDGF-B expression cassette of pAGL7PB was isolated by BamHI digestion and inserted into the BamHI
site of the yeast-bacteria shuttle vector pAB24 described above. A yeast expression plasmid, pYAGL7PB, was isolated.
A plasmid map of pYAGL7PB is shown in Figure 3. The nucleotide sequence of the complete expression cassette and
the predicted amino acid sequence of the open reading frame (ORF) encoding the truncated α-factor leader-PDGF-B
primary translation product are given in SEQ ID NO: 24 and SEQ ID NO: 25, respectively.

Expression Strain Identification: MB2-1(pYAGL7PB)

[0086] Expression plasmid pYAGL7PB was transformed into S. cerevisiae MB2-1 by standard procedures and uracil
prototrophic colonies were selected. Individual colonies from independent transformants were screened for
expression following inoculation of single colonies into medium that selects for leucine prototrophs. The medium also
is high in glucose to keep expression of sequences under ADR2 regulation (including the PDGF-B gene) repressed.
Cultures were subsequently diluted and grown to confluence in low glucose medium lacking uracil. Cell-free culture
supernatants were prepared and assayed for PDGF-BB by immunoactivity (ELISA) and by mitogenic activity on 3T3
cells. A high PDGF-BB expressing colony, MB2-1(pYAGL7PB #5), was identified.

Example 3: Construction of Expression Plasmid pYL7PPB 
General Description 

[0087] Plasmid pYL7PPB (also known as pYAGL7PPB) includes an expression cassette with the following features.
Transcription initiation and termination are mediated by the inducible, hybrid yeast promoter ADH/GAP and the Matα
transcriptional terminator described above. The gene further includes an open reading frame encoding a truncated
yeast α-factor leader to mediate secretion of rhPDGF-BB. The propeptide sequence included in the expression
construct is only the native N-terminal propeptide sequence; the native C-terminal propeptide sequence was not
included in the construct. Inclusion of the N-terminal propeptide sequence resulted in enhanced expression of
rhPDGF-BB, presumably because of improved folding. Dibasic processing sites at the truncated α-factor
leader/N-terminal propeptide and N-terminal propeptide/PDGF-B junctions were included to facilitate production of
correctly processed rhPDGF-BB polypeptide by yeast. Figure 4 shows a map of the pYL7PPB expression cassette
highlighting these features and the sites relevant to the construction of this expression cassette. The nucleotide
sequence and predicted amino acid sequence of the open reading frame encoding the truncated α-factor
leader-proPDGF-B primary translation product are shown in SEQ ID NO: 26 and SEQ ID NO: 27, respectively.

Sequential Construction of pYL7PPB 

Source of rhPDGF-B cDNA

[0088] A cloned cDNA encoding native human preproPDGF-B, λhPDGFb-17, was provided by collaborators Arne
Ostman and Carl Heldin. Isolation of the cDNA encoding hPDGF-B was achieved using a cDNA library prepared from
RNA isolated from a human clonal glioma cell line, U-343 MGa Cl 2 (Ostman et al. (1988) J. Biol. Chem. 263:
16202-16208).

Construction of pSV7d-PDGF A103-B1 

[0089] Plasmid pSV7d-PDGF A103-B1 was the source of the N-terminal propeptide-PDGF-B cDNA. The plasmid
was constructed as described below.

[0090] The 3 kb EcoRI PDGF-B cDNA insert from clone λhPDGFb-17 was excised and cloned into the unique EcoRI
site of the mammalian expression vector pSV7d to give plasmid phPDGFβ-1 (also known as pSV7d-PDGF-B1).
[0091] A mammalian plasmid, pSV7d-PDGF A103-B1, for the coexpression of both PDGF-A and -B chains from their
respective cDNAs, was constructed as follows. Plasmid phPDGFβ-1 was digested with PstI under conditions favoring
cleavage at only one of the two plasmid PstI sites (the desired single cleavage being at the site in the ampicillin
resistance gene of the pSV7d vector backbone) and ligated with PstI-digested pSV7d-PDGF-A103(D1). This latter
plasmid is strictly analogous to the PDGF-B mammalian expression plasmid phPDGFβ-1, except that it includes cDNA
encoding the long, 211 amino acid form of the PDGF-A chain rather than the PDGF-B chain cDNA. This plasmid contains
a single PstI site in the ampicillin resistance gene of the pSV7d vector backbone.

[0092] Following transformation, bacterial colonies were screened for the presence of both PDGF-B and PDGF-A
cDNA sequences with the respective, appropriately labeled EcoRI cDNA probes. Colonies positive for both PDGF-B
and -A chain sequences were further screened by EcoRI digestion of plasmid DNA, and plasmid pSV7d-PDGF
A103-B1, having the predicted EcoRI pattern, was identified.

Mutagenesis of hPDGF-B cDNA

[0093] The PDGF-B cDNA was mutagenized: (1) to introduce a SacI site enabling introduction of the truncated α-factor
secretion leader, and (2) to change the hPDGF-B cDNA sequence encoding dibasic amino acids Arg-Arg to
encode Lys-Arg. This dibasic combination is more efficiently cleaved than Arg-Arg by the yeast dibasic processing
enzyme KEX2 endopeptidase. The template for mutagenesis was prepared as follows.

[0094] The ~3 kb EcoRI hPDGF-B cDNA was isolated from pSV7d-PDGF A103-B1 and inserted into the EcoRI site
of pBR322 to give plasmid pPPB/6. The nucleotide sequence of the 2.7 kb PstI-EcoRI cDNA fragment was verified.
The 0.9 kb PstI-NcoI cDNA fragment was inserted into the PstI-NcoI sites of M13 and the nucleotide sequence of the
insert verified. A partial nucleotide sequence and the predicted amino acid sequence of the PDGF-B cDNA are given
in SEQ ID NO: 28 and SEQ ID NO: 29, respectively.

[0095] A double mutagenesis of the M13 PstI-NcoI PDGF-B cDNA fragment was performed by standard methods using
the following primers. Primer 1 (SEQ ID NO: 30) introduces a SacI site; primer 2 (SEQ ID NO: 31) converts Arg-Arg
to Lys-Arg at the propeptide/PDGF-B junction. Additional mutations are introduced to facilitate detection of mutagenized
sequences by hybridization with the labeled primer. Primer 1 mutagenesis caused no change in the primary amino acid
sequence; only the Arg→Lys amino acid change resulted from primer 2 mutagenesis. Mutant hPDGF-B inserts
were detected by hybridization with both primer 1 and primer 2 radiolabeled probes. The DNA sequence was verified, and RF
(double-stranded) plasmid was prepared.
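
The outcome described for the two primers (silent changes plus a single Arg→Lys substitution) can be checked in silico by translating template and mutant and listing the residues that differ. The sketch below is illustrative only: the two short sequences are hypothetical stand-ins, not the sequences of SEQ ID NOs: 28-31, and the helper names are assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def amino_acid_changes(template: str, mutant: str):
    """Return (1-based position, old residue, new residue) for each difference."""
    old, new = translate(template), translate(mutant)
    if len(old) != len(new):
        raise ValueError("sequences must encode the same number of residues")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


# Hypothetical junction ...Ala-Glu-Arg-Arg-Ser... mutated to ...Ala-Glu-Lys-Arg-Ser...
template = "GCTGAACGTCGTTCT"
mutant = "GCTGAAAAGAGATCT"  # CGT->AAG is Arg->Lys; CGT->AGA is silent (Arg->Arg)
print(amino_acid_changes(template, mutant))
```

Only the Arg→Lys change is reported; the second altered codon still encodes Arg, which mirrors the kind of silent marker mutation the paragraph describes.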


Construction of pL7PPB (pAGL7PPB) 

[0096] Essentially, the steps described below result in the substitution of the XhoI-SalI portion of the PDGF-B
expression cassette in pAGL7PB encoding the C-terminal portion of the truncated α-factor leader, the Lys-Arg dibasic
processing site and PDGF-B (Figure 2) with an XhoI-SalI gene fragment encoding the C-terminal portion of the
truncated α-factor leader, a Lys-Arg dibasic processing site, the PDGF-B N-terminal propeptide, a Lys-Arg dibasic
processing site, and PDGF-B. The sequences encoding the N-terminal PDGF-B propeptide and PDGF-B were derived from
cDNA as described above. A map of the resulting expression cassette is shown in Figure 4.

[0097] A 447 bp SacI-SphI fragment including most of the proPDGF-B gene was isolated from the M13 RF containing
the modified preproPDGF-B cDNA. Synthetic oligonucleotides, including sequences encoding the C-terminal part of
truncated α-factor leader, a Lys-Arg dibasic processing site, and the N-terminal portion of the PDGF-B propeptide (SEQ
ID NOs: 32-33), were joined to give a fragment with a 3' SacI overhang. Synthetic oligonucleotides, Sph-Sal I/Sph-Sal
II, including sequences encoding the last 14 amino acids of PDGF-B and stop codons, were joined to give a SphI-SalI
fragment (SEQ ID NOs: 34-35). These two sets of annealed oligonucleotides were ligated to the 447 bp SacI-SphI
proPDGF gene fragment. This resulted in a gene fragment including sequences encoding the C-terminal part of
truncated α-factor leader, a Lys-Arg dibasic processing site and proPDGF-B.

[0098] Synthetic oligonucleotides, including sequences encoding the middle amino acids of the truncated α-factor
leader, were joined, resulting in a fragment with a 5' XhoI overhang (SEQ ID NOs: 32-33). This annealed oligonucleotide
was ligated with pAGL7PB that had been cut with XhoI (a unique site in the pAGL7PB plasmid that lies in the expression
cassette; see Figure 2). Following oligonucleotide annealing, the modified plasmid was digested with SalI, resulting in
loss of the pAGL7PB XhoI-SalI fragment and yielding a vector/gene fragment.

[0099] The final step in the construction of the PDGF-B expression cassette was the ligation of the gene fragment
into the vector/gene fragment to give plasmid pL7PPB (pAGL7PPB), as shown in Figure 5. The PstI-BamHI insert
fragment was isolated and nucleotide sequencing confirmed that the desired construction had been obtained. A map
of the PDGF-B expression cassette in pL7PPB is shown in Figure 4.

Construction of pYL7PPB (pYAGL7PPB) 

[0100] The PDGF-B expression cassette of pL7PPB was isolated following BamHI digestion and inserted into the
BamHI site of the yeast shuttle vector pAB24, described above, resulting in yeast expression plasmid pYL7PPB. A
map of pYL7PPB is shown in Figure 6. The nucleotide sequence of the complete expression cassette and the predicted
amino acid sequence of the open reading frame (ORF) encoding truncated α-factor leader-Lys-Arg-proPDGF-B are
given in SEQ ID NO: 36 and SEQ ID NO: 37, respectively. The complete nucleotide sequence of yeast expression
plasmid pYL7PPB has been determined.


Expression Strain Identification: MB2-1(pYL7PPB) 

[0101] Expression plasmid pYL7PPB was transformed into S. cerevisiae MB2-1 by standard procedures, and
plasmid-harboring uracil prototrophs were selected as isolated colonies. Individual colonies from independent transformants
were screened for expression following inoculation of isolated colonies into growth medium that selects for leucine
prototrophs. The medium is also high in glucose to keep expression of sequences under ADR2 regulation (including
the PDGF-B gene) repressed. Cultures were subsequently diluted and grown to confluence in low-glucose selective
growth medium lacking uracil. Cell-free supernatants were assayed for PDGF-BB by immunoactivity (ELISA) and by
mitogenic activity on 3T3 cells. Frozen stocks were prepared of several transformants exhibiting consistently high levels
of expression. Following repeated testing, the transformant exhibiting, on average, the highest expression of PDGF-BB,
MB2-1(pYL7PPB #22), was selected.

Example 4: Expression Plasmid pYJST400 

[0102] The Lys-Arg dibasic processing site between the α-factor leader sequence and the N-terminal propeptide
was eliminated from expression plasmid pYL7PPB by in vitro mutagenesis to construct expression plasmid pYJST400.
Thus pYJST400 has a single dibasic processing site, which resides at the propeptide/PDGF-B junction. Elimination of
this first processing site was done to determine its relative effect on secretion of rhPDGF-BB from yeast as mediated
by the α-factor leader peptide.
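
Where a Lys-Arg processing site falls in a precursor can be located by scanning its one-letter sequence for the dibasic motif. The following is a minimal sketch, assuming as input the first 54 residues of SEQ ID NO:2 (the truncated α-factor leader followed by the start of PDGF-B, i.e. the construct without the propeptide); the function name is hypothetical, and the scan is a plain string search rather than a model of KEX2 specificity.

```python
def dibasic_cleavage_points(protein: str, motif: str = "KR"):
    """Return 0-based indices of the residue immediately after each motif."""
    points = []
    start = protein.find(motif)
    while start != -1:
        points.append(start + len(motif))
        start = protein.find(motif, start + 1)
    return points


# Leader (37 residues, ending in Lys-Arg) plus the first 17 residues of PDGF-B.
precursor = "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAKRSLGSLTIAEPAMIAECK"

points = dibasic_cleavage_points(precursor)
leader = precursor[:points[0]]
mature_start = precursor[points[0]:]
print(points, len(leader), mature_start[:5])
```

The single site found splits the sequence after residue 37, which agrees with the -37 to -1 numbering of the leader in the sequence listing.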

Example 5: Expression of Recombinant Human PDGF-BB 

[0103] Recombinant human PDGF-BB is produced by a strain of the yeast Saccharomyces cerevisiae genetically
modified with a multicopy yeast expression plasmid that includes a gene encoding human PDGF-B. The preferred S.
cerevisiae strain MB2-1 has the genotype: Matα, ura3Δ, leu2-3, leu2-112, his3-11, his3-15, pep4Δ, [cir°]. It is
auxotrophic for uracil, leucine, and histidine, requiring these nutritional supplements when grown in minimal medium. MB2-1
does not contain an endogenous 2-μ plasmid, which tends to interfere with the stability of the introduced plasmids and
encourages recombination between endogenous and introduced plasmids. The strain does not express functional
protease A, the product of the PEP4 gene, which interferes with the production of heterologous proteins. MB2-1 was
designed to impart these favorable characteristics, which include selection for high expression of heterologous proteins.
[0104] Yeast expression plasmids pYAGL7PB, pYL7PPB, and pYJST400 were transformed into yeast strain MB2-1
as described by Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929-1933 and plated on ura-, 8% glucose, sorbitol
plates. Transformants were grown in leu-, 8% glucose liquid medium for 24 hours and then plated onto leu-, 8% glucose
sorbitol plates to obtain individual colonies. Individual colonies were picked and grown in 3 ml of leu-, 8% glucose medium
for 24 hours at 30°C, and then inoculated (1:50) into 1 liter of ura-, 1% glucose medium and grown for 75 hours at 30°C.
Yeast culture medium was assayed for PDGF activity by the human foreskin fibroblast mitogen assay (see Example
6 below).

[0105] As shown in Table 1, inclusion of the sequence encoding the N-terminal propeptide resulted in a mean 3.4-fold
increase in secretion of rhPDGF-BB as measured by bioactivity and by ELISA. Additionally, elimination of the Lys-Arg
processing site at the leader/propeptide junction resulted in a 2.8-fold decrease in rhPDGF-BB secretion (Table 1).
[0106] These results indicate that the presence of the native N-terminal propeptide enhances secretion of biologically
active mature rhPDGF-BB when flanked by preferred processing sites that have been modified for improved recognition
by a proteolytic enzyme of the yeast host cell. Thus, cleavage at the leader/propeptide junction, as well as at the
propeptide/PDGF-B junction, apparently facilitates the proper folding and/or processing and/or transport of the
pro-PDGF-B, resulting in enhanced secretion of mature rhPDGF-BB.

Example 6: Human Foreskin Fibroblast (HFF) Mitogen Assay for PDGF

[0107] Human foreskin fibroblast stocks were stored frozen; freezing was at passage 13. Prior to use, HFF were
thawed and then grown in T75 flasks until confluent, which usually occurred at 5-7 days. Growth medium contained
Dulbecco's Modified Eagle's Medium (DMEM), 20% fetal bovine serum (FBS), 1 mM sodium pyruvate, 300 µg/ml
L-glutamine, 100 U/ml penicillin, and 100 µg/ml streptomycin. Cells were incubated at 37°C in a humidified 7% CO2, 93%
air atmosphere. At confluency, cells were passaged by rinsing the monolayer with phosphate buffered saline (PBS)
lacking Ca2+ and Mg2+, dissociating them in trypsin containing EDTA, and diluting them with growth medium. Cells were
passaged no more than 8 times after thawing.

[0108] To assay for PDGF, HFFs were plated as follows. The cells were rinsed and dissociated with trypsin as above.
The trypsinized cells were pelleted and resuspended to a concentration of 1 x 10^5 cells/ml in medium similar to growth
medium, except that 5% FBS replaced 20% FBS; 100 µl of suspension was dispensed into each well of a 96-well
microtiter plate; and then the cells were incubated 5-6 days under the above described conditions.

[0109] PDGF in the sample was determined by monitoring 3H-thymidine incorporation into HFF DNA stimulated by
PDGF. Samples were added to the wells containing HFF monolayers, and the assay plates incubated as above for 18
hours. The HFF cultures were then pulsed with [Methyl-3H]thymidine (10 µC/ml final concentration, 1 µC/well) at 37°C
under the above described incubation conditions for 8 hours. After incubation, the cells were rinsed with PBS and fixed.
Fixing was by incubation with 5% trichloroacetic acid (TCA) and then 100% methanol for 15 minutes, followed by drying
in air. The cells were then solubilized with 0.3N NaOH and then counted in a liquid scintillation counter.
Control samples were treated as the samples described above and were prepared as follows. For positive controls,
PDGF, purchased from PDGF, Inc., was dissolved to a final concentration of 100 ng/ml in DMEM containing 10 mg/ml
BSA. A standard curve was prepared; the first point was 10 ng/ml, the remaining points were 2-fold serial dilutions.
Each dilution was tested in triplicate. Negative controls, which lacked both sample and control PDGF, were also run.
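
The standard curve layout described above (a 10 ng/ml top point, 2-fold serial dilutions, triplicate wells) can be tabulated with a few lines of code. This is a minimal sketch: the number of curve points is not stated in the text, so `N_POINTS` is an assumption, and the function name is hypothetical.

```python
def dilution_series(top: float, factor: float, points: int):
    """Concentrations for a serial dilution that starts at `top`."""
    return [top / factor**i for i in range(points)]


TOP_NG_PER_ML = 10.0  # first point of the standard curve (from the text)
FACTOR = 2.0          # 2-fold serial dilutions (from the text)
REPLICATES = 3        # each dilution tested in triplicate (from the text)
N_POINTS = 8          # assumption: the text does not give the number of points

curve = dilution_series(TOP_NG_PER_ML, FACTOR, N_POINTS)
wells_needed = len(curve) * REPLICATES
print(curve)
print(wells_needed)
```

With eight points the curve spans 10 ng/ml down to about 0.078 ng/ml and occupies 24 wells of the 96-well plate; a different assumed point count changes both figures accordingly.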

Example 7: Expression Plasmids pYLUI 

[0110] Plasmid pYLUIGF24 includes an expression cassette with the hybrid yeast promoter ADH/GAP and Matα
factor leader sequences fused to a sequence encoding the human IGF-I-A gene. This sequence was synthetically
derived using yeast preferred codons. A dibasic amino acid processing site is present at the α-factor leader/IGF-I-A
junction. The nucleotide sequence and predicted amino acid sequence of the open reading frame encoding α-factor
leader/IGF-I-A primary translation product are given in SEQ ID NO: 38 and SEQ ID NO: 39, respectively.
[0111] Plasmid pYLUIGF34 differs from pYLUIGF24 only in its open reading frame. This cassette includes an open
reading frame encoding a full length Matα factor leader sequence fused to a sequence encoding the human IGF-I-A
gene with its C-terminal prosequence. Dibasic amino acid processing sites are present at the α-factor leader/IGF-I-A
and IGF-I-A/IGF-I-A prosequence junctions. The nucleotide sequence and predicted amino acid sequence of the open
reading frame encoding α-factor leader-proIGF-I-A primary translation product are given in SEQ ID NO: 40 and SEQ
ID NO: 41, respectively.

[0112] Both of these plasmids were generated by inserting the respective expression cassette into the unique BamHI
cloning site of the yeast shuttle vector pAB24 as described above.

Example 8: Expression of Recombinant Human IGF-I-A

[0113] Recombinant human IGF-I-A is produced by a strain of the yeast Saccharomyces cerevisiae genetically
modified with a multicopy yeast expression plasmid that includes a gene encoding human IGF-I-A. Yeast expression
plasmids pYLUIGF24 and pYLUIGF34 were transformed into a yeast strain by procedures previously mentioned.
[0114] Western blot data indicated that properly processed IGF-I-A protein was obtained with the prosequence,
modified KEX2 processing site, and a yeast secretion leader.

[0115] All publications and patent applications mentioned in the specification are indicative of the level of those skilled
in the art to which this invention pertains.

[0116] Although the foregoing invention has been described in some detail by way of illustration and example for 
purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within 
the scope of the appended claims. 

SEQUENCE LISTING

[0117]

(1) GENERAL INFORMATION:

(i) APPLICANT: Tekamp-Olson, Patricia

(ii) TITLE OF INVENTION: METHOD FOR EXPRESSION OF HETEROLOGOUS PROTEINS IN YEAST

(iii) NUMBER OF SEQUENCES: 41

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Bell Seltzer IP Group of Alston & Bird, LLP
(B) STREET: 3605 Glenwood Ave. Suite 310
(C) CITY: Raleigh
(D) STATE: NC
(E) COUNTRY: US
(F) ZIP: 27622

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Spruill, W. Murray
(B) REGISTRATION NUMBER: 32,943
(C) REFERENCE/DOCKET NUMBER: 5784-4

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 919 420 2202
(B) TELEFAX: 919 881 3175

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 444 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Chimeric nucleic acid"

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens/Saccharomyces cerevisiae

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..441

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION:
(D) OTHER INFORMATION: /function= "mediates secretion of proteins"
/product= "yeast alpha factor leader peptide"
/standard_name= "alpha factor signal/leader sequence"

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 112..441
(D) OTHER INFORMATION: /product= "rhPDGF-B protein" /standard_name

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG AGA TTT CCT TCA ATT TTT ACT GCA GTT TTA TTC GCA GCC TCG AGC  48
Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-37     -35                 -30                 -25

GCA TTA GCT GCT CCA GTC AAC ACT ACA ACA GAA GAT GAA ACG GCA CAA  96
Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
    -20                 -15                 -10

ATT CCG GCT AAA AGA TCT TTG GGT TCT TTG ACT ATC GCT GAA CCA GCT  144
Ile Pro Ala Lys Arg Ser Leu Gly Ser Leu Thr Ile Ala Glu Pro Ala
-5                  1               5                   10

ATG ATC GCT GAA TGT AAG ACT AGA ACT GAA GTT TTC GAA ATC TCC AGA  192
Met Ile Ala Glu Cys Lys Thr Arg Thr Glu Val Phe Glu Ile Ser Arg
            15                  20                  25

AGA TTG ATC GAC AGA ACT AAC GCT AAC TTC TTG GTT TGG CCA CCA TGT  240
Arg Leu Ile Asp Arg Thr Asn Ala Asn Phe Leu Val Trp Pro Pro Cys
        30                  35                  40

GTT GAA GTT CAA AGA TGT TCT GGT TGT TGT AAC AAC AGA AAC GTT CAA  288
Val Glu Val Gln Arg Cys Ser Gly Cys Cys Asn Asn Arg Asn Val Gln
    45                  50                  55

TGT AGA CCA ACT CAA GTT CAA TTG AGA CCA GTT CAA GTT AGA AAG ATC  336
Cys Arg Pro Thr Gln Val Gln Leu Arg Pro Val Gln Val Arg Lys Ile
60                  65                  70                  75

GAA ATC GTT AGA AAG AAG CCA ATC TTC AAG AAG GCT ACT GTT ACT TTG  384
Glu Ile Val Arg Lys Lys Pro Ile Phe Lys Lys Ala Thr Val Thr Leu
                80                  85                  90

GAA GAC CAC TTG GCT TGT AAG TGT GAA ACT GTC GCC GCT GCC AGG CCA  432
Glu Asp His Leu Ala Cys Lys Cys Glu Thr Val Ala Ala Ala Arg Pro
            95                  100                 105

GTT ACT TAA TAG  444
Val Thr *
        110
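
The feature coordinates given for SEQ ID NO:1 can be cross-checked against the residue numbering with simple arithmetic. The sketch below is only a consistency check under the assumption that both ranges are 1-based, inclusive, and in frame; the helper name is hypothetical.

```python
def codons_in(start: int, end: int) -> int:
    """Number of codons in a 1-based, inclusive nucleotide range."""
    length = end - start + 1
    if length % 3:
        raise ValueError("range is not a whole number of codons")
    return length // 3


cds = codons_in(1, 441)       # CDS feature, which includes the first stop codon
mature = codons_in(112, 441)  # mat_peptide feature, which includes the stop codon
leader = cds - mature         # codons preceding the mature peptide

print(cds, mature, leader)
```

The 147 codons agree with the stated length of SEQ ID NO:2, and the 37 leader codons agree with the numbering that runs from -37 to -1 before residue 1 of the mature peptide.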

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 147 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-37     -35                 -30                 -25

Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
    -20                 -15                 -10

Ile Pro Ala Lys Arg Ser Leu Gly Ser Leu Thr Ile Ala Glu Pro Ala
-5                  1               5                   10

Met Ile Ala Glu Cys Lys Thr Arg Thr Glu Val Phe Glu Ile Ser Arg
            15                  20                  25

Arg Leu Ile Asp Arg Thr Asn Ala Asn Phe Leu Val Trp Pro Pro Cys
        30                  35                  40

Val Glu Val Gln Arg Cys Ser Gly Cys Cys Asn Asn Arg Asn Val Gln
    45                  50                  55

Cys Arg Pro Thr Gln Val Gln Leu Arg Pro Val Gln Val Arg Lys Ile
60                  65                  70                  75

Glu Ile Val Arg Lys Lys Pro Ile Phe Lys Lys Ala Thr Val Thr Leu
                80                  85                  90

Glu Asp His Leu Ala Cys Lys Cys Glu Thr Val Ala Ala Ala Arg Pro
            95                  100                 105

Val Thr *
        110



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 352 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CTCTAGATAA AAGATCTTTG GGTTCTTTGA CTATCGCTGA ACCAGCTATG ATCGCTGAAT 60
GTAAGACTAG AACTGAAGTT TTCGAAATCT CCAGAAGATT GATCGACAGA ACTAACGCTA 120
ACTTCTTGGT TTGGCCACCA TGTGTTGAAG TTCAAAGATG TTCTGGTTGT TGTAACAACA 180
GAAACGTTCA ATGTAGACCA ACTCAAGTTC AATTGAGACC AGTTCAAGTT AGAAAGATCG 240
AAATCGTTAG AAAGAAGCCA ATCTTCAAGA AGGCTACTGT TACTTTGGAA GACCACTTGG 300
CTTGTAAGTG TGAAACTGTT GCTGGTGCTA GACCAGTTAC TTAATAGCGT CG 352

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 352 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Complementing strand to the preceding SEQ ID NO:, listed to show the
terminal overhangs produced upon assembly."

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

TCGACGACGC TATTAAGTAA CTGGTCTAGC AGCAGCAACA GTTTCACACT TACAAGCCAA 60
GTGGTCTTCC AAAGTAACAG TAGCCTTCTT GAAGATTGGC TTCTTTCTAA CGATTTCGAT 120
CTTTCTAACT TGAACTGGTC TCAATTGAAC TTGAGTTGGT CTACATTGAA CGTTTCTGTT 180
GTTACAACAA CCAGAACATC TTTGAACTTC AACACATGGT GGCCAAACCA AGAAGTTAGC 240
GTTAGTTCTG TCGATCGAAT CTTCTGGAGA TTTCGAAAAC TTAGTTCTAG TCTTACATTC 300
AGCGATCATA GCTGGTTCAG CGATAGTCAA AGAACCCAAA GATCTTTTAT CT 352



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CTCTAGATAA AAGATCTTTG GGTTCTTTGA CTATCGCTGA ACCA 44

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

GCTATGATCG CTGAATGTAA GACTAGAACT GAAGTTTTCG AAATC 45

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TCCAGAAGAT TGATCGACAG AACTAACGCT AACTTCTTGG TTTGG 45

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

CCACCATGTG TTGAAGTTCA AAGATGTTCT GGTTGTTGTA ACAAC 45

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

AGAAACGTTC AATGTAGACC AACTCAAGTT CAATTGAGAC CAGTT 45

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

CAAGTTAGAA AGATCGAAAT CGTTAGAAAG AAGCCAATCT TCAAG 45

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Saccharomyces cerevisiae

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

AAGGCTACTG TTACTTTGGA AGACCACTTG GCTTGTAAGT GTGA 44



(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

AACTGTTGCT GGTGCTAGAC CAGTTACTTA ATAGCGTCG 39

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

TCTATTTTCT AGAAACCC 18

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

AAGAAACTGA TAGCGACTTG GTCGATACTA GCGACTTACA TTCTG 45



(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

ATCTTGACTT CAAAAGCTTT AGAGGTCTTC TAACTAGCTG TCTTG 45

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

ATTGCGATTG AAGAACCAAA CCGGTGGTAC ACAACTTCAA GTTTC 45

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

TACAAGACCA ACAACATTGT TGTCTTTGCA AGTTACATCT GGTTG 45

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

AGTTCAAGTT AACTCTGGTC AAGTTCAATC TTTCTAGCTT TAGCA 45

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

ATCTTTCTTC GGTTAGAAGT TCTTCCGATG ACAATGAAAC CTTC 44

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

TGGTGAACCG AACATTCACA CTTTGACAAC GACGACGATC TGGT 44

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

CAATGAATTA TCGCAGCAGC T 21



(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 81 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Assembled synthetic oligonucleotides resulting in a truncated alpha factor
mating pheromone leader sequence."

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Synthetic (derived from Saccharomyces cerevisiae)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

TTTTATTCGC AGCCTCGAGC GCATTAGCTG CTCCAGTCAA CACTACAACA GAAGATGAAA 60
CGGCACAAAT TCCGGCTAAA A 81

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "This sequence is the complementing strand of SEQ ID NO:1. It is submitted
to illustrate the two terminal overhangs produced after assembly."

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Synthetic (derived from Saccharomyces cerevisiae)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

GATCTTTTAG CCGGAATTTG TGCCGTTTCA TCTTCTGTTG TAGTGTTGAC TGGAGCAGCT 60
AATGCGCTCG AGGCTGCGAA TAAAACTGCA 90
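
The terminal overhangs mentioned in the description can be read off by locating the reverse complement of the 81-nucleotide strand (SEQ ID NO:22) inside the 90-nucleotide strand (SEQ ID NO:23). The sketch below is a minimal illustration with hypothetical function names; it assumes that the stray "7" in the scanned listing of SEQ ID NO:23 stands for "T".

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def overhangs(top: str, bottom: str):
    """Return the 5' and 3' parts of `bottom` that flank its duplex with `top`."""
    core = reverse_complement(top)
    at = bottom.find(core)
    if at == -1:
        raise ValueError("strands are not complementary")
    return bottom[:at], bottom[at + len(core):]


seq22 = ("TTTTATTCGCAGCCTCGAGCGCATTAGCTGCTCCAGTCAACACTACAACA"
         "GAAGATGAAACGGCACAAATTCCGGCTAAAA")
seq23 = ("GATCTTTTAGCCGGAATTTGTGCCGTTTCATCTTCTGTTGTAGTGTTGAC"
         "TGGAGCAGCTAATGCGCTCGAGGCTGCGAATAAAACTGCA")

print(overhangs(seq22, seq23))
```

Under that reading, the longer strand extends four nucleotides beyond the duplex at its 5' end and five at its 3' end, which accounts for the difference between the 81 and 90 base lengths.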


(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1845 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Synthetic chimera"

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens/Saccharomyces cerevisiae

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1115..1558

(ix) FEATURE:
(A) NAME/KEY: promoter
(B) LOCATION: 1..1114
(D) OTHER INFORMATION: /standard_name= "ADH/GAP promoter"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1115..1225
(D) OTHER INFORMATION: /function= "mediates secretion of rhPDGF-B"
/product= "truncated alpha factor leader/signal peptide"
/standard_name= "alpha factor leader/signal sequence"

(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1226..1558
(D) OTHER INFORMATION: /product= "rhPDGF-B peptide" /standard_name= "rhPDGF-B"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:



GGATCCTTCA ATATGCGCAC ATACGCTGTT ATGTTCAAGG TCCCTTCGTT TAAGAACGAA 60
AGCGGTCTTC CTTTTGAGGG ATGTTTCAAG TTGTTCAAAT CTATCAAATT TGCAAATCCC 120
CAGTCTCTAT CTAGCTAGAT ATACCAATGG CAAACTGAGC ACAACAATAC CAGTCCGGAT 180
CAACTGGCAC CATCTCTCCC GTAGTCTCAT CTAATTTTTC TTCCGGATGA GGTTCCAGAT 240
ATACCGCAAC ACCTTTATTA TGGTTTCCCT GAGGGAATAA TAGAATGTCC CATTCGAAAT 300
CACCAATTCT AAACCTGGGC GAATTGTATT TCGGGTTTGT TAACTCGTTC CAGTCAGGAA 360
TGTTCCACGT GAAGCTATCT TCCAGCAAAG TCTCCACTTC TTCATCAAAT TGTGGGAGAA 420
TACTCCCAAT GCTCTTATCT ATGGGACTTC CGGGAAACAC AGTACCGATA CTTCCCAATT 480
CGTCTTCAGA GCTCATTGTT TGTTTGAAGA GACTAATCAA AGAATCGTTT TCTCAAAAAA 540
ATTAATATCT TAACTGATAG TTTGATCAAA GGGGCAAAAC GTAGGGGCAA ACAAACGGAA 600
AAATCGTTTC TCAAATTTTC TGATGCCAAG AACTCTAACC AGTCTTATCT AAAAATTGCC 660
TTATGATCCG TCTCTCCGGT TACAGCCTGT GTAACTGATT AATCCTGCCT TTCTAATCAC 720
CATTCTAATG TTTTAATTAA GGGATTTTGT CTTCATTAAC GGCTTTCGCT CATAAAAATG 780
TTATGACGTT TTGCCCGCAG GCGGGAAACC ATCCACTTCA CGAGACTGAT CTCCTCTGCC 840
GGAACACCGG GCATCTCCAA CTTATAAGTT GGAGAAATAA GAGAATTTCA GATTGAGAGA 900
ATGAAAAAAA AAAACCCTGA AAAAAAAGGT TGAAACCAGT TCCCTGAAAT TATTCCCCTA 960
CTTGACTAAT AAGTATATAA AGACGGTAGG TATTGATTGT AATTCTGTAA ATCTATTTCT 1020
TAAACTTCTT AAATTCTACT TTTATAGTTA GTCTTTTTTT TAGTTTTAAA ACACCAAGAA 1080
CTTAGTTTCG AATAAACACA CATAAACAAA CACC ATG AGA TTT CCT TCA ATT 1132

Met Arg Phe Pro Ser Ile
-37 -35

TTT ACT GCA GTT TTA TTC GCA GCC TCG AGC GCA TTA GCT GCT CCA GTC 1180
Phe Thr Ala Val Leu Phe Ala Ala Ser Ser Ala Leu Ala Ala Pro Val
-30 -25 -20

AAC ACT ACA ACA GAA GAT GAA ACG GCA CAA ATT CCG GCT AAA AGA TCT 1228
Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln Ile Pro Ala Lys Arg Ser
-15 -10 -5 1

TTG GGT TCT TTG ACT ATC GCT GAA CCA GCT ATG ATC GCT GAA TGT AAG 1276
Leu Gly Ser Leu Thr Ile Ala Glu Pro Ala Met Ile Ala Glu Cys Lys
5 10 15

ACT AGA ACT GAA GTT TTC GAA ATC TCC AGA AGA TTG ATC GAC AGA ACT 1324
Thr Arg Thr Glu Val Phe Glu Ile Ser Arg Arg Leu Ile Asp Arg Thr
20 25 30

AAC GCT AAC TTC TTG GTT TGG CCA CCA TGT GTT GAA GTT CAA AGA TGT 1372
Asn Ala Asn Phe Leu Val Trp Pro Pro Cys Val Glu Val Gln Arg Cys
35 40 45

TCT GGT TGT TGT AAC AAC AGA AAC GTT CAA TGT AGA CCA ACT CAA GTT 1420
Ser Gly Cys Cys Asn Asn Arg Asn Val Gln Cys Arg Pro Thr Gln Val
50 55 60 65

CAA TTG AGA CCA GTT CAA GTT AGA AAG ATC GAA ATC GTT AGA AAG AAG 1468
Gln Leu Arg Pro Val Gln Val Arg Lys Ile Glu Ile Val Arg Lys Lys
70 75 80







CCA ATC TTC AAG AAG GCT ACT GTT ACT TTG GAA GAC CAC TTG GCT TGT 1516
Pro Ile Phe Lys Lys Ala Thr Val Thr Leu Glu Asp His Leu Ala Cys
85 90 95

AAG TGT GAA ACT GTC GCC GCT GCC AGG CCA GTT ACT TAA TAG 1558
Lys Cys Glu Thr Val Ala Ala Ala Arg Pro Val Thr * *
100 105 110

CGTCGTCGAC TTTGTTCCCA CTGTACTTTT AGCTCGTACA AAATACAATA TACTTTTCAT 1618
TTCTCCGTAA ACAACATGTT TTCCCATGTA ATATCCTTTT CTATTTTTCG TTCCGTTACC 1678
AACTTTACAC ATACTTTATA TAGCTATTCA CTTCTATACA CTAAAAAACT AAGACAATTT 1738
TAATTTTGCT GCCTGCCATA TTTCAATTTG TTATAAATTC CTATAATTTA TCCTATTAGT 1798
AGCTAAAAAA AGATGAATGT GAATCGAATC CTAAGAGAAT TCGGATC 1845
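
The feature coordinates given for SEQ ID NO:24 can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original listing: the dictionary and function names are illustrative assumptions, and only the coordinates themselves are taken from the feature table above.

```python
# Feature coordinates copied from the SEQ ID NO:24 feature table
# (1-based, inclusive). All names below are illustrative only.
FEATURES = {
    "CDS": (1115, 1558),
    "misc_feature": (1115, 1225),  # truncated alpha factor leader/signal
    "mat_peptide": (1226, 1558),   # rhPDGF-B
}


def span_length(start, end):
    """Length in nucleotides of a 1-based inclusive interval."""
    return end - start + 1


def codon_count(start, end):
    """Number of whole codons in the interval; raise if not a multiple of 3."""
    n = span_length(start, end)
    if n % 3 != 0:
        raise ValueError(f"{start}..{end} is {n} nt, not a multiple of 3")
    return n // 3


summary = {name: codon_count(*loc) for name, loc in FEATURES.items()}
for name, codons in summary.items():
    print(f"{name}: {codons} codons")
```

Under these coordinates the leader and the mature peptide tile the CDS exactly (37 + 111 = 148 codons), and the 148 codons, counting the two terminal stop codons, agree with the 148 positions listed for SEQ ID NO:25.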



(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS 

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 



Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-37 -35 -30 -25

Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
-20 -15 -10

Ile Pro Ala Lys Arg Ser Leu Gly Ser Leu Thr Ile Ala Glu Pro Ala
-5 1 5 10

Met Ile Ala Glu Cys Lys Thr Arg Thr Glu Val Phe Glu Ile Ser Arg
15 20 25

Arg Leu Ile Asp Arg Thr Asn Ala Asn Phe Leu Val Trp Pro Pro Cys
30 35 40

Val Glu Val Gln Arg Cys Ser Gly Cys Cys Asn Asn Arg Asn Val Gln
45 50 55

Cys Arg Pro Thr Gln Val Gln Leu Arg Pro Val Gln Val Arg Lys Ile
60 65 70 75

Glu Ile Val Arg Lys Lys Pro Ile Phe Lys Lys Ala Thr Val Thr Leu
80 85 90

Glu Asp His Leu Ala Cys Lys Cys Glu Thr Val Ala Ala Ala Arg Pro
95 100 105

Val Thr * *
110

(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 621 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "This construct is a chimeric nucleic acid that contains a truncated yeast alpha
factor leader sequence linked to the human PDGF prosequence and the human rhPDGF-B gene (cDNA)."

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Saccharomyces cerevisiae/Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..621 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature

(B) LOCATION: 25..105

(D) OTHER INFORMATION: /function= "Mediates secretion of human rhPDGF-B"
/product= "Saccharomyces cerevisiae alpha-factor
leader/signal sequence"

(ix) FEATURE: 

(A) NAME/KEY: transit_peptide 

(B) LOCATION: 112..288 

(D) OTHER INFORMATION: /function= "Mediates protein transport" 
/product= "human PDGF propeptide" 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 289..621 

(D) OTHER INFORMATION: /product= "human PDGF-B peptide" /standard_name= "rhPDGF-B" 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 






ATG AGA TTT CCT TCA ATT TTT ACT GCA GTT TTA TTC GCA GCC TCG AGC 48
Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-96 -95 -90 -85

GCA TTA GCT GCT CCA GTC AAC ACT ACA ACA GAA GAT GAA ACG GCA CAA 96
Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
-80 -75 -70 -65

ATT CCG GCT AAA AGA GAC CCC ATT CCC GAG GAG CTC TAC GAG ATG CTG 144
Ile Pro Ala Lys Arg Asp Pro Ile Pro Glu Glu Leu Tyr Glu Met Leu
-60 -55 -50

AGT GAC CAC TCG ATC CGC TCC TTT GAT GAT CTC CAA CGC CTG CTG CAC 192
Ser Asp His Ser Ile Arg Ser Phe Asp Asp Leu Gln Arg Leu Leu His
-45 -40 -35

GGA GAC CCC GGA GAG GAA GAT GGG GCC GAG TTG GAC CTG AAC ATG ACC 240
Gly Asp Pro Gly Glu Glu Asp Gly Ala Glu Leu Asp Leu Asn Met Thr
-30 -25 -20

CGC TCC CAC TCT GGA GGC GAG CTG GAG AGC TTG GCT CGG GGG AAG AGG 288
Arg Ser His Ser Gly Gly Glu Leu Glu Ser Leu Ala Arg Gly Lys Arg
-15 -10 -5

AGC CTG GGT TCC CTG ACC ATT GCT GAG CCG GCC ATG ATC GCC GAG TGC 336
Ser Leu Gly Ser Leu Thr Ile Ala Glu Pro Ala Met Ile Ala Glu Cys
1 5 10 15

AAG ACG CGC ACC GAG GTG TTC GAG ATC TCC CGG CGC CTC ATA GAC CGC 384
Lys Thr Arg Thr Glu Val Phe Glu Ile Ser Arg Arg Leu Ile Asp Arg
20 25 30

ACC AAC GCC AAC TTC CTG GTG TGG CCG CCC TGT GTG GAG GTG CAG CGC 432
Thr Asn Ala Asn Phe Leu Val Trp Pro Pro Cys Val Glu Val Gln Arg
35 40 45

TGC TCC GGC TGC TGC AAC AAC CGC AAC GTG CAG TGC CGC CCC ACC CAG 480
Cys Ser Gly Cys Cys Asn Asn Arg Asn Val Gln Cys Arg Pro Thr Gln
50 55 60

GTG CAG CTG CGA CCT GTC CAG GTG AGA AAG ATC GAG ATT GTG CGG AAG 528
Val Gln Leu Arg Pro Val Gln Val Arg Lys Ile Glu Ile Val Arg Lys
65 70 75 80

AAG CCA ATC TTT AAG AAG GCC ACG GTG ACG CTG GAA GAC CAC CTG GCA 576
Lys Pro Ile Phe Lys Lys Ala Thr Val Thr Leu Glu Asp His Leu Ala
85 90 95

TGC AAG TGT GAG ACA GTG GCA GCT GCA CGG CCT GTG ACC TAA TAG 621
Cys Lys Cys Glu Thr Val Ala Ala Ala Arg Pro Val Thr * *
100 105 110
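
The paired nucleotide and amino acid rows of SEQ ID NO:26 follow the standard genetic code, so individual rows can be spot-checked by translating them. The sketch below is illustrative only: the table construction and the `translate` helper are assumptions introduced here, and the two codon strings are the first six and last six codons of the listing (leader start and the end of the mature peptide with both stop codons).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First and last codons of the SEQ ID NO:26 coding sequence as listed.
first_codons = "ATG AGA TTT CCT TCA ATT"
last_codons = "CGG CCT GTG ACC TAA TAG"

print(translate(first_codons))  # one-letter form of Met Arg Phe Pro Ser Ile
print(translate(last_codons))   # one-letter form of Arg Pro Val Thr * *
```

The same helper could be applied row by row to flag any codon whose translation disagrees with the residue printed beneath it.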

(2) INFORMATION FOR SEQ ID NO:27: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 207 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 



Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-96 -95 -90 -85

Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
-80 -75 -70 -65

Ile Pro Ala Lys Arg Asp Pro Ile Pro Glu Glu Leu Tyr Glu Met Leu
-60 -55 -50

Ser Asp His Ser Ile Arg Ser Phe Asp Asp Leu Gln Arg Leu Leu His
-45 -40 -35

Gly Asp Pro Gly Glu Glu Asp Gly Ala Glu Leu Asp Leu Asn Met Thr
-30 -25 -20

Arg Ser His Ser Gly Gly Glu Leu Glu Ser Leu Ala Arg Gly Lys Arg
-15 -10 -5

Ser Leu Gly Ser Leu Thr Ile Ala Glu Pro Ala Met Ile Ala Glu Cys
1 5 10 15

Lys Thr Arg Thr Glu Val Phe Glu Ile Ser Arg Arg Leu Ile Asp Arg
20 25 30

Thr Asn Ala Asn Phe Leu Val Trp Pro Pro Cys Val Glu Val Gln Arg
35 40 45

Cys Ser Gly Cys Cys Asn Asn Arg Asn Val Gln Cys Arg Pro Thr Gln
50 55 60

Val Gln Leu Arg Pro Val Gln Val Arg Lys Ile Glu Ile Val Arg Lys
65 70 75 80

Lys Pro Ile Phe Lys Lys Ala Thr Val Thr Leu Glu Asp His Leu Ala
85 90 95

Cys Lys Cys Glu Thr Val Ala Ala Ala Arg Pro Val Thr
100 105 110



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1320 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic chimera" 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens/Saccharomyces cerevisiae 
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 454..1179

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 454..519

(D) OTHER INFORMATION: /product= "PDGF-B prepeptide" /standard_name= "PDGF-B presequence" 
(ix) FEATURE: 

(A) NAME/KEY: transit_peptide 

(B) LOCATION: 45S..696 

(D) OTHER INFORMATION: /function= "mediates protein transport" 
/product= "PDGF-B propeptide" 
/standard_name= "PDGF-B prosequence" 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 697..1023

(D) OTHER INFORMATION: /product= "rhPDGF-B peptide" /standard_name= "rhPDGF-B" 
(ix) FEATURE: 

(A) NAME/KEY: transit_peptide 

(B) LOCATION: 1024..1179

(D) OTHER INFORMATION: /function= "mediates protein transport" 
/product= "PDGF-B propeptide" 
/standard_name= "PDGF-B prosequence" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 



GAATTCCCAG AAAATGTTGC AAAAAAGCTA AGCCGGCGGG CAGAGGAAAA CGCCTGTAGC 60
CGGCGAGTGA AGACGAAGCA TCGACTGCCG TGTTCCTTTT CCTCTTGGAG GTTGGPvGTCC 120
CCTG3GCGCC CCCACACGGC TAGACGCGTC GGCTGGTTCG CGACGCAGCC CCCCGGCCGX 180
GGATGCTGCA CTCGGGCTCG GGATCCGCCC AGGTAGCGGC CTCGGACCCA GGTCCTGCGC 240
CCAGGTCCTC CCCTGCCCCC CAGCGACGGA GCCGGGGCCG GGGGCGGCGG CGCCGGGGGC 300
ATGCGGGTGA GCCGCGGCTG CAGAGGCCTG AGCGCCTGAT CGCCGCGGAC CCGAGCCGAG 360
CCCACCCCCC TCCCCAGCCC CCCACCCTGG CCGCGGGGGC GGCGCGCTCG ATCTACGCGT 420
TCGGGGCCCC GCGGGGCCGG GCCCGGAGTC GGC ATG AAT CGC TGC TGG GCG CTC 474
Met Asn Arg Cys Trp Ala Leu
-81 -80 -75

TTC CTG TCT CTC TGC TGC TAC CTG CGT CTG GTC AGC GCC GAG GGG GAC 522
Phe Leu Ser Leu Cys Cys Tyr Leu Arg Leu Val Ser Ala Glu Gly Asp
-70 -65 -60

CCC ATT CCC GAG GAG CTT TAT GAG ATG CTG AGT GAC CAC TCG ATC CGC 570
Pro Ile Pro Glu Glu Leu Tyr Glu Met Leu Ser Asp His Ser Ile Arg
-55 -50 -45

TCC TTT GAT GAT CTC CAA CGC CTG CTG CAC GGA GAC CCC GGA GAG GAA
Ser Phe Asp Asp Leu Gln Arg Leu Leu His Gly Asp Pro Gly Glu Glu
-40 -35 -30

GAT GGG GCC GAG TTG GAC CTG AAC ATG ACC CGC TCC CAC TCT GGA GGC
Asp Gly Ala Glu Leu Asp Leu Asn Met Thr Arg Ser His Ser Gly Gly
-25 -20 -15

GAG CTG GAG AGC TTG GCT CGT GGA AGA AGG AGC CTG GGT TCC CTG ACC
Glu Leu Glu Ser Leu Ala Arg Gly Arg Arg Ser Leu Gly Ser Leu Thr
-10 -5 1 5

ATT GCT GAG CCG GCC ATG ATC GCC GAG TGC AAG ACG CGC ACC GAG GTG
Ile Ala Glu Pro Ala Met Ile Ala Glu Cys Lys Thr Arg Thr Glu Val
10 15 20

TTC GAG ATC TCC CGG CGC CTC ATA GAC CGC ACC AAC GCC AAC TTC CTG
Phe Glu Ile Ser Arg Arg Leu Ile Asp Arg Thr Asn Ala Asn Phe Leu
25 30 35

GTG TGG CCG CCC TGT GTG GAG GTG CAG CGC TGC TCC GGC TGC TGC AAC
Val Trp Pro Pro Cys Val Glu Val Gln Arg Cys Ser Gly Cys Cys Asn
40 45 50

AAC CGC AAC GTG CAG TGC CGC CCC ACC CAG GTG CAG CTG CGA CCT GTC
Asn Arg Asn Val Gln Cys Arg Pro Thr Gln Val Gln Leu Arg Pro Val
55 60 65 70

CAG GTG AGA AAG ATC GAG ATT GTG CGG AAG AAG CCA ATC TTT AAG AAG
Gln Val Arg Lys Ile Glu Ile Val Arg Lys Lys Pro Ile Phe Lys Lys
75 80 85

GCC ACG GTG ACG CTG GAA GAC CAC CTG GCA TGC AAG TGT GAG ACA GTG
Ala Thr Val Thr Leu Glu Asp His Leu Ala Cys Lys Cys Glu Thr Val
90 95 100

GCA GCT GCA CGG CCT GTG ACC CGA AGC CCG GGG GGT TCC CAG GAG CAG
Ala Ala Ala Arg Pro Val Thr Arg Ser Pro Gly Gly Ser Gln Glu Gln
105 110 115

CGA GCC AAA ACG CCC CAA ACT CGG GTG ACC ATT CGG ACG GTG CGA GTC
Arg Ala Lys Thr Pro Gln Thr Arg Val Thr Ile Arg Thr Val Arg Val
120 125 130

CGC CGG CCC CCC AAG GGC AAG CAC CGG AAA TTC AAG CAC ACG CAT GAC
Arg Arg Pro Pro Lys Gly Lys His Arg Lys Phe Lys His Thr His Asp
135 140 145 150

AAG ACG GCA CTG AAG GAG ACC CTT GGA GCC TAG GGGCATCGGC AGGAGAGTGT
Lys Thr Ala Leu Lys Glu Thr Leu Gly Ala *
155 160

GTGGGCAGGG TTATTTAATA TGGTATTTGT GTATTGCCCC CATGGGGCCT TGGAGTAGAT

AATATTGTTT CCCTCGTCCG TCTGTCTCGA TGCCTGATTC GGACGGCCAA TGGTGCCTCC



(2) INFORMATION FOR SEQ ID NO:29: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 242 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 



Met Asn Arg Cys Trp Ala Leu Phe Leu Ser Leu Cys Cys Tyr Leu Arg
-81 -80 -75 -70

Leu Val Ser Ala Glu Gly Asp Pro Ile Pro Glu Glu Leu Tyr Glu Met
-65 -60 -55 -50

Leu Ser Asp His Ser Ile Arg Ser Phe Asp Asp Leu Gln Arg Leu Leu
-45 -40 -35

His Gly Asp Pro Gly Glu Glu Asp Gly Ala Glu Leu Asp Leu Asn Met
-30 -25 -20

Thr Arg Ser His Ser Gly Gly Glu Leu Glu Ser Leu Ala Arg Gly Arg
-15 -10 -5

Arg Ser Leu Gly Ser Leu Thr Ile Ala Glu Pro Ala Met Ile Ala Glu
1 5 10 15

Cys Lys Thr Arg Thr Glu Val Phe Glu Ile Ser Arg Arg Leu Ile Asp
20 25 30

Arg Thr Asn Ala Asn Phe Leu Val Trp Pro Pro Cys Val Glu Val Gln
35 40 45

Arg Cys Ser Gly Cys Cys Asn Asn Arg Asn Val Gln Cys Arg Pro Thr
50 55 60

Gln Val Gln Leu Arg Pro Val Gln Val Arg Lys Ile Glu Ile Val Arg
65 70 75

Lys Lys Pro Ile Phe Lys Lys Ala Thr Val Thr Leu Glu Asp His Leu
80 85 90 95

Ala Cys Lys Cys Glu Thr Val Ala Ala Ala Arg Pro Val Thr Arg Ser
100 105 110

Pro Gly Gly Ser Gln Glu Gln Arg Ala Lys Thr Pro Gln Thr Arg Val
115 120 125

Thr Ile Arg Thr Val Arg Val Arg Arg Pro Pro Lys Gly Lys His Arg
130 135 140

Lys Phe Lys His Thr His Asp Lys Thr Ala Leu Lys Glu Thr Leu Gly
145 150 155

Ala
160

(2) INFORMATION FOR SEQ ID NO:30: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic oligonucleotide (primer)"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Synthetic (derived from Homo sapiens sequence)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic oligonucleotide (primer)"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Synthetic (derived from Homo sapiens sequence)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

CTTGGCTCGG GGGAAGAGGA GCCTGGG 27

(2) INFORMATION FOR SEQ ID NO:32: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 89 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic oligonucleotide"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens/Saccharomyces cerevisiae derived sequence

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 44..89
(D) OTHER INFORMATION: /function= "truncated alpha factor leader/lys-arg proc./N-term. propept"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

TCGAGCGCAT TAGCTGCTCC AGTCAACACT ACAACAGAAG ATGAAACGGC ACAAATTCCG 60
GCTAAAAGAG ACCCCATTCC CGAGGAGCT 89

(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 81 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic oligonucleotide"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens derived sequence

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..39
(D) OTHER INFORMATION: /function= "C-term. alpha factor leader/lys-arg proc./N-term. propeptide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:



CCTCGGGAAT GGGGTCTCTT TTAGCCGGAA TTTGTGCCGT TTCATCTTCT GTTGTAGTGT 60 
TGACTGGAGC AGCTAATGCG C 81 
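
SEQ ID NO:32 and SEQ ID NO:33 appear to be the two strands of one synthetic duplex, which can be checked by reverse-complementing one against the other. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the character at position 10 of SEQ ID NO:32, which is unreadable in the scanned listing, is assumed to be T.

```python
# Oligonucleotides as listed for SEQ ID NO:32 and SEQ ID NO:33 (5'->3').
# Assumption: the unreadable character at position 10 of SEQ ID NO:32 is T.
SEQ32 = (
    "TCGAGCGCAT" "TAGCTGCTCC" "AGTCAACACT" "ACAACAGAAG" "ATGAAACGGC"
    "ACAAATTCCG" "GCTAAAAGAG" "ACCCCATTCC" "CGAGGAGCT"
)
SEQ33 = (
    "CCTCGGGAAT" "GGGGTCTCTT" "TTAGCCGGAA" "TTTGTGCCGT" "TTCATCTTCT"
    "GTTGTAGTGT" "TGACTGGAGC" "AGCTAATGCG" "C"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def overhangs(top, bottom):
    """Return (5' overhang, 3' overhang) left on `top` when `bottom` anneals."""
    duplex = reverse_complement(bottom)
    start = top.find(duplex)
    if start < 0:
        raise ValueError("strands are not complementary")
    return top[:start], top[start + len(duplex):]


five_prime, three_prime = overhangs(SEQ32, SEQ33)
print(len(SEQ32), len(SEQ33), five_prime, three_prime)
```

With the assumed T, the 81-base strand pairs with an internal stretch of the 89-base strand and leaves a four-base single-stranded overhang at each end of SEQ ID NO:32, consistent with a fragment designed for ligation into restriction-digested DNA.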



(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic oligonucleotide"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens derived sequence

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

CAAGTGTGAG ACAGTGGCAG CTGCACGGCC TGTGACCTAA TAGCGTCG 48

(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 56 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic oligonucleotide"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens derived sequence

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

TCGACGACGC TATTAGGTCA CAGGCCGTGC AGCTCCCACT GTCTCACACT TGCATG 56


(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2023 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Synthetic chimera"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens/Saccharomyces cerevisiae

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1115..1735

(ix) FEATURE:

(A) NAME/KEY: promoter
(B) LOCATION: 1..1114
(D) OTHER INFORMATION: /standard_name= "ADH/GAP promoter"

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1115..1225
(D) OTHER INFORMATION: /function= "mediates secretion of rhPDGF-B"
/product= "alpha factor signal/truncated alpha factor leader peptide"
/standard_name= "truncated alpha factor signal/leader sequence"

(ix) FEATURE:

(A) NAME/KEY: transit_peptide
(B) LOCATION: 1226..1402
(D) OTHER INFORMATION: /product= "PDGF-B propeptide" /standard_name= "PDGF-B prosequence"

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 1403..1735
(D) OTHER INFORMATION: /product= "rhPDGF-B protein" /standard_name= "rhPDGF-B"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
GGATCCTTCA ATATGCGCAC ATACGCTGTT ATGTTCAAGG TCCCTTCGTT TAAGAACGAA 60
AGCGGTCTTC CTTTTGAGGG ATGTTTCAAG TTGTTCAAAT CTATCAAATT TGCAAATCCC 120
CAGTCTGTAT CTAGCTAGAT ATACCAACGG CAAACTGAGC ACAACAATAC CAGTCCGGAT 180
CAACTGGCAC CATCTCTCCC GTAGTGTCAT CTAATTTTTC TTCCGGATGA GGTTCCAGAT 240
ATACCGCAAC ACCTTTATTA TGGTTTCCCT GAGGGAATAA TAGAATGTCC CATTCGAAAT 300
CACCAATTCT AAACCTGGGC GAATTGTATT TCGGGTTTCT TAACTCGTTC CAGTCAGGAA 360
TGTTCCACGT GAAGCTATCT TCCAGCAAAG TCTCCACTTC TTCATCAAAT TGTGGGAGAA 420
TACTCCCAAT GCTCTTATCT ATGGGACTTC CGGGAAACAC AGTACCGATA CTTCCCAATT 480
CGTCTTCAGA GCTCATTGTT TGTTTGAAGA GACTAATCAA AGAATCGTTT TCTCAAAAAA 540
ATTAATATCT TAACTGATAG TTTGATCAAA GGGGCAAAAC GTAGGGGCAA ACAAACGGAA 600
AAATCGTTTC TCAAATTTTC TGATGCCAAG AACTCTAACC AGTCTTATCT AAAAATTGCC 660
TTATGATCCG TCTCTCCGGT TACAGCCTGT GTAACTCATT AATCCTGCCT TTCTAATCAC 720
CATTCTAATG TTTTAATTAA GGGATTTTGT CTTCATTAAC GGCTTTCGCT CATAAAAATG 780
TTATGACGTT TTGCCCGCAG GCGGGAAACC ATCCACTTCA CGAGACTGAT CTCCTCTGCC 840
GGAACACCGG GCATCTCCAA CTTATAAGTT GGAGAAATAA GAGAATTTCA GATTGAGAGA 900
ATGAAAAAAA AAAACCCTGA AAAAAAAGGT TGAAACCAGT TCCCTGAAAT TATTCCCCTA 960
CTTGACTAAT AAGTATATAA AGACGGTAGG TATTGATTGT AATTCTGTAA ATCTATTTCT
TAAACTTCTT AAATTCTACT TTTATAGTTA GTCTTTTTTT TAGTTTTAAA ACACCAAGAA
CTTAGTTTCG AATAAACACA CATAAACAAA CACC ATG AGA TTT CCT TCA ATT
Met Arg Phe Pro Ser Ile
-96 -95

TTT ACT GCA GTT TTA TTC GCA GCC TCG AGC GCA TTA GCT GCT CCA GTC
Phe Thr Ala Val Leu Phe Ala Ala Ser Ser Ala Leu Ala Ala Pro Val
-90 -85 -80 -75

AAC ACT ACA ACA GAA GAT GAA ACG GCA CAA ATT CCG GCT AAA AGA GAC
Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln Ile Pro Ala Lys Arg Asp
-70 -65 -60

CCC ATT CCC GAG GAG CTC TAC GAG ATG CTG AGT GAC CAC TCG ATC CGC
Pro Ile Pro Glu Glu Leu Tyr Glu Met Leu Ser Asp His Ser Ile Arg
-55 -50 -45

TCC TTT GAT GAT CTC CAA CGC CTG CTG CAC GGA GAC CCC GGA GAG GAA
Ser Phe Asp Asp Leu Gln Arg Leu Leu His Gly Asp Pro Gly Glu Glu
-40 -35 -30

GAT GGG GCC GAG TTG GAC CTG AAC ATG ACC CGC TCC CAC TCT GGA GGC
Asp Gly Ala Glu Leu Asp Leu Asn Met Thr Arg Ser His Ser Gly Gly
-25 -20 -15

GAG CTG GAG AGC TTG GCT CGG GGG AAG AGG AGC CTG GGT TCC CTG ACC
Glu Leu Glu Ser Leu Ala Arg Gly Lys Arg Ser Leu Gly Ser Leu Thr
-10 -5 1 5

ATT GCT GAG CCG GCC ATG ATC GCC GAG TGC AAG ACG CGC ACC GAG GTG
Ile Ala Glu Pro Ala Met Ile Ala Glu Cys Lys Thr Arg Thr Glu Val
10 15 20

TTC GAG ATC TCC CGG CGC CTC ATA GAC CGC ACC AAC GCC AAC TTC CTG
Phe Glu Ile Ser Arg Arg Leu Ile Asp Arg Thr Asn Ala Asn Phe Leu
25 30 35

GTG TGG CCG CCC TGT GTG GAG GTG CAG CGC TGC TCC GGC TGC TGC AAC
Val Trp Pro Pro Cys Val Glu Val Gln Arg Cys Ser Gly Cys Cys Asn
40 45 50

AAC CGC AAC GTG CAG TGC CGC CCC ACC CAG GTG CAG CTG CGA CCT GTC
Asn Arg Asn Val Gln Cys Arg Pro Thr Gln Val Gln Leu Arg Pro Val
55 60 65 70

CAG GTG AGA AAG ATC GAG ATT GTG CGG AAG AAG CCA ATC TTT AAG AAG
Gln Val Arg Lys Ile Glu Ile Val Arg Lys Lys Pro Ile Phe Lys Lys
75 80 85

GCC ACG GTG ACG CTG GAA GAC CAC CTG GCA TGC AAG TGT GAG ACA GTG
Ala Thr Val Thr Leu Glu Asp His Leu Ala Cys Lys Cys Glu Thr Val
90 95 100

GCA GCT GCA CGG CCT GTG ACC TAA TAG CGTCGTCGAC TTTGTTCCCA
Ala Ala Ala Arg Pro Val Thr * *
105 110

CTGTACTTTT AGCTCGTACA AAATACAATA TACTTTTCAT TTCTCCGTAA ACAACATGTT 1815
TTCCCATGTA ATATCCTTTT CTATTTTTCG TTCCGTTACC AACTTTACAC ATACTTTATA 1875
TAGCTATTCA CTTCTATACA CTAAAAAACT AAGACAATTT TAATTTTGCT GCCTGCCATA 1935
TTTCAATTTG TTATAAATTC CTATAATTTA TCCTATTAGT AGCTAAAAAA AGATGAATGT 1995
GAATCGAATC CTAAGAGAAT TCGGATCC 2023

(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 207 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:



Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-96 -95 -90 -85

Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
-80 -75 -70 -65

Ile Pro Ala Lys Arg Asp Pro Ile Pro Glu Glu Leu Tyr Glu Met Leu
-60 -55 -50

Ser Asp His Ser Ile Arg Ser Phe Asp Asp Leu Gln Arg Leu Leu His
-45 -40 -35

Gly Asp Pro Gly Glu Glu Asp Gly Ala Glu Leu Asp Leu Asn Met Thr
-30 -25 -20

Arg Ser His Ser Gly Gly Glu Leu Glu Ser Leu Ala Arg Gly Lys Arg
-15 -10 -5

Ser Leu Gly Ser Leu Thr Ile Ala Glu Pro Ala Met Ile Ala Glu Cys
1 5 10 15

Lys Thr Arg Thr Glu Val Phe Glu Ile Ser Arg Arg Leu Ile Asp Arg
20 25 30

Thr Asn Ala Asn Phe Leu Val Trp Pro Pro Cys Val Glu Val Gln Arg
35 40 45

Cys Ser Gly Cys Cys Asn Asn Arg Asn Val Gln Cys Arg Pro Thr Gln
50 55 60

Val Gln Leu Arg Pro Val Gln Val Arg Lys Ile Glu Ile Val Arg Lys
65 70 75 80

Lys Pro Ile Phe Lys Lys Ala Thr Val Thr Leu Glu Asp His Leu Ala
85 90 95

Cys Lys Cys Glu Thr Val Ala Ala Ala Arg Pro Val Thr * *
100 105 110



(2) INFORMATION FOR SEQ ID NO:38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 480 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Chimeric DNA molecule"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens/Saccharomyces cerevisiae

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..471

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..255
(D) OTHER INFORMATION: /function= "mediates protein secretion"
/product= "Yeast alpha factor leader peptide"
/standard_name= "Alpha factor signal/leader sequence"

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 256..471
(D) OTHER INFORMATION: /product= "rhIGF-I-A protein" /standard_name= "rhIGF-I-A"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:

ATG AGA TTT CCT TCA ATT TTT ACT GCA GTT TTA TTC GCA GCA TCC TCC
Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-85 -80 -75 -70

GCA TTA GCT GCT CCA GTC AAC ACT ACA ACA GAA GAT GAA ACG GCA CAA
Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
-65 -60 -55

ATT CCG GCT GAA GCT GTC ATC GGT TAC TTA GAT TTA GAA GGG GAT TTC
Ile Pro Ala Glu Ala Val Ile Gly Tyr Leu Asp Leu Glu Gly Asp Phe
-50 -45 -40

GAT GTT GCT GTT TTG CCA TTT TCC AAC AGC ACA AAT AAC GGG TTA TTG
Asp Val Ala Val Leu Pro Phe Ser Asn Ser Thr Asn Asn Gly Leu Leu
-35 -30 -25

TTT ATA AAT ACT ACT ATT GCC AGC ATT GCT GCT AAA GAA GAA GGG GTA
Phe Ile Asn Thr Thr Ile Ala Ser Ile Ala Ala Lys Glu Glu Gly Val
-20 -15 -10

CAG CTG GAT AAA AGA GGT CCA GAA ACC TTG TGT GGT GCT GAA TTG GTC
Gln Leu Asp Lys Arg Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val
-5 1 5 10

GAT GCT TTG CAA TTC GTT TGT GGT GAC AGA GGT TTC TAC TTC AAC AAG
Asp Ala Leu Gln Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys
15 20 25

CCA ACC GGT TAC GGT TCT TCT TCT AGA AGA GCT CCA CAA ACC GGT ATC
Pro Thr Gly Tyr Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile
30 35 40

GTT GAC GAA TGT TGT TTC AGA TCT TGT GAC TTG AGA AGA TTG GAA ATG
Val Asp Glu Cys Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met
45 50 55

TAC TGT GCT CCA TTG AAG CCA GCT AAG TCT GCT TGA TAA GTCGACTTT
Tyr Cys Ala Pro Leu Lys Pro Ala Lys Ser Ala * *
60 65 70



(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 157 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: 
Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-85 -80 -75 -70

Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
-65 -60 -55

Ile Pro Ala Glu Ala Val Ile Gly Tyr Leu Asp Leu Glu Gly Asp Phe
-50 -45 -40

Asp Val Ala Val Leu Pro Phe Ser Asn Ser Thr Asn Asn Gly Leu Leu
-35 -30 -25

Phe Ile Asn Thr Thr Ile Ala Ser Ile Ala Ala Lys Glu Glu Gly Val
-20 -15 -10

Gln Leu Asp Lys Arg Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val
-5 1 5 10

Asp Ala Leu Gln Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys
15 20 25

Pro Thr Gly Tyr Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile
30 35 40

Val Asp Glu Cys Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met
45 50 55

Tyr Cys Ala Pro Leu Lys Pro Ala Lys Ser Ala * *
60 65 70



(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 621 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "Chimeric DNA molecule"

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens/Saccharomyces cerevisiae

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..579

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 1..255
(D) OTHER INFORMATION: /function= "mediates secretion of protein"
/product= "3'end of yeast alpha factor leader peptide"
/standard_name= "alpha factor leader/signal sequence"

(ix) FEATURE:

(A) NAME/KEY: mat_peptide
(B) LOCATION: 256..471
(D) OTHER INFORMATION: /product= "rhIGF-I-A protein" /standard_name= "rhIGF-I-A"

(ix) FEATURE:

(A) NAME/KEY: transit_peptide
(B) LOCATION: 472..579
(D) OTHER INFORMATION: /function= "mediates protein transport/translocation"
/product= "IGF-I-A propeptide"
/standard_name= "IGF-I-A prosequence"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
ATG AGA TTT CCT TCA ATT TTT ACT GCA GTT TTA TTC GCA GCA TCC TCC
Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-85 -80 -75 -70

GCA TTA GCT GCT CCA GTC AAC ACT ACA ACA GAA GAT GAA ACG GCA CAA
Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
-65 -60 -55

ATT CCG GCT GAA GCT GTC ATC GGT TAC TTA GAT TTA GAA GGG GAT TTC
Ile Pro Ala Glu Ala Val Ile Gly Tyr Leu Asp Leu Glu Gly Asp Phe
-50 -45 -40

GAT GTT GCT GTT TTG CCA TTT TCC AAC AGC ACA AAT AAC GGG TTA TTG
Asp Val Ala Val Leu Pro Phe Ser Asn Ser Thr Asn Asn Gly Leu Leu
-35 -30 -25

TTT ATA AAT ACT ACT ATT GCC AGC ATT GCT GCT AAA GAA GAA GGG GTA
Phe Ile Asn Thr Thr Ile Ala Ser Ile Ala Ala Lys Glu Glu Gly Val
-20 -15 -10

CAG CTG GAT AAA AGA GGT CCA GAA ACC TTG TGT GGT GCT GAA TTG GTC
Gln Leu Asp Lys Arg Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val
-5 1 5 10

GAT GCT TTG CAA TTC GTT TGT GGT GAC AGA GGT TTC TAC TTC AAC AAG
Asp Ala Leu Gln Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys
15 20 25

CCA ACC GGT TAC GGT TCT TCT TCC AGA AGA GCT CCA CAA ACC GGT ATC
Pro Thr Gly Tyr Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile
30 35 40

GTT GAC GAA TGT TGT TTC AGA TCT TGT GAC TTG AGA AGA TTG GAA ATG
Val Asp Glu Cys Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met
45 50 55

TAC TGT GCT CCA TTG AAG CCT GCT AAG TCT GCT AAA AGA TCC GTC AGA
Tyr Cys Ala Pro Leu Lys Pro Ala Lys Ser Ala Lys Arg Ser Val Arg
60 65 70 75

GCT CAA AGA CAC ACC GAT ATG CCA AAG ACC CAA AAG GAA GTT CAC TTG
Ala Gln Arg His Thr Asp Met Pro Lys Thr Gln Lys Glu Val His Leu
80 85 90

AAG AAC GCT TCC AGA GGT TCT GCT GGT AAC AAG AAC TAC AGA ATG TGA
Lys Asn Ala Ser Arg Gly Ser Ala Gly Asn Lys Asn Tyr Arg Met *
95 100 105

TAA GTCGACTTTG TTCCCACTGT ACTTTTAGCT CGTACAAAAT AC



(2) INFORMATION FOR SEQ ID NO:41 : 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 



Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
-85 -80 -75 -70

Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
-65 -60 -55

Ile Pro Ala Glu Ala Val Ile Gly Tyr Leu Asp Leu Glu Gly Asp Phe
-50 -45 -40

Asp Val Ala Val Leu Pro Phe Ser Asn Ser Thr Asn Asn Gly Leu Leu
-35 -30 -25

Phe Ile Asn Thr Thr Ile Ala Ser Ile Ala Ala Lys Glu Glu Gly Val
-20 -15 -10

Gln Leu Asp Lys Arg Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val
-5 1 5 10

Asp Ala Leu Gln Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys
15 20 25

Pro Thr Gly Tyr Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile
30 35 40

Val Asp Glu Cys Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met
45 50 55

Tyr Cys Ala Pro Leu Lys Pro Ala Lys Ser Ala Lys Arg Ser Val Arg
60 65 70 75

Ala Gln Arg His Thr Asp Met Pro Lys Thr Gln Lys Glu Val His Leu
80 85 90

Lys Asn Ala Ser Arg Gly Ser Ala Gly Asn Lys Asn Tyr Arg Met
95 100 105



Claims 

1. A nucleotide sequence comprising in the 5' to 3' direction and operably linked (a) a yeast-recognized transcription
and translation initiation region, (b) a coding sequence for a hybrid precursor polypeptide, and (c) a yeast-recognized
transcription and translation termination region, wherein said hybrid precursor polypeptide comprises:

5'-SP-(PS)n-1-(LP-PS)n-2-(NPRO MHP-PS)n-3-MHP-(PS-CPRO MHP)n-4-3'

wherein:

SP comprises a signal peptide sequence for a yeast secreted protein;

PS comprises a preferred processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a yeast secreted protein;

NPRO MHP comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence for said mature heterologous mammalian protein of interest;

CPRO MHP comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow for proteolytic processing of said precursor polypeptide to said mature protein
in vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.

2. The nucleotide sequence of claim 1, wherein said mammalian protein is a PDGF protein or an IGF protein, or
variants thereof.

3. The nucleotide sequence of claim 2, wherein said protein is a human protein. 

4. The nucleotide sequence of claim 3, wherein said human PDGF is PDGF-BB or variants thereof. 

5. The nucleotide sequence of claim 4, wherein SP is a signal peptide sequence for a Saccharomyces cerevisiae
α-factor.

6. The nucleotide sequence of claim 5, wherein said α-factor is Matα or variants thereof.

7. The nucleotide sequence of claim 6, wherein n-2 = 1, n-3 = 1, and n-4 = 0.

8. The nucleotide sequence of claim 7, wherein LP is a truncated leader peptide sequence. 

9. The nucleotide sequence of claim 8, wherein said coding sequence for the hybrid precursor polypeptide has the
nucleotide sequence set forth in SEQ ID NO. 26.

10. The nucleotide sequence of claim 8, wherein said hybrid precursor polypeptide has the amino acid sequence set
forth in SEQ ID NO. 27.

11. The nucleotide sequence of claim 3, wherein n-3 = 0 and n-4 = 1 and said human IGF protein is IGF-I-A or variants
thereof.

12. The nucleotide sequence of claim 11, wherein SP is a signal peptide sequence for a Saccharomyces cerevisiae
α-factor.

13. The nucleotide sequence of claim 12, wherein said α-factor is Matα or variants thereof.

14. The nucleotide sequence of claim 13, wherein said coding sequence for said hybrid precursor polypeptide has the
nucleotide sequence set forth in SEQ ID NO. 40.

15. The nucleotide sequence of claim 13, wherein said hybrid precursor polypeptide has the amino acid sequence set 
forth in SEQ ID NO. 41. 

16. A vector comprising a nucleotide sequence that comprises in the 5' to 3' direction and operably linked (a) a
yeast-recognized transcription and translation initiation region, (b) a coding sequence for a hybrid precursor polypeptide,
and (c) a yeast-recognized transcription and translation termination region, wherein said hybrid precursor
polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein:

SP comprises a signal peptide sequence for a yeast secreted protein;

PS comprises a preferred processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a yeast secreted protein;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence for said mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow for proteolytic processing of said precursor polypeptide to said mature protein
in vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.

17. The vector of claim 16, wherein said vector is the yeast shuttle vector pAB24.

18. A yeast host cell stably transformed with a nucleotide sequence comprising an expression cassette, said cassette
comprising in the 5' to 3' direction and operably linked (a) a yeast-recognized transcription and translation initiation
region, (b) a coding sequence for a hybrid precursor polypeptide, and (c) a yeast-recognized transcription and
translation termination region, wherein said hybrid precursor polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein:

SP comprises a signal peptide sequence for a yeast secreted protein;

PS comprises a preferred processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a yeast secreted protein;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence for said mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow for proteolytic processing of said precursor polypeptide to said mature protein
in vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.

19. The cell of claim 18, wherein said processing sites are dipeptides cleaved by the KEX2 gene product of
Saccharomyces.

20. The cell of claim 19, wherein said dipeptides are 5'-Lys-Arg-3'.

21. The cell of claim 20, wherein said yeast cell is from the genus Saccharomyces.

22. The cell of claim 21, wherein said yeast cell is S. cerevisiae.

23. A method for expression of heterologous proteins and their secretion in the biologically active mature form using
a yeast host cell as the expression system, said method comprising transforming said yeast cell with a vector
comprising a nucleotide sequence that comprises in the 5' to 3' direction and operably linked (a) a yeast-recognized
transcription and translation initiation region, (b) a coding sequence for a hybrid precursor polypeptide, and (c) a
yeast-recognized transcription and translation termination region, wherein said hybrid precursor polypeptide
comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein:

SP comprises a signal peptide sequence for a yeast secreted protein;

PS comprises a preferred processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a yeast secreted protein;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence for said mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow for proteolytic processing of said precursor polypeptide to said mature protein
in vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.
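
The structural formula recited in the independent claims can be read as a small combinatorial rule: each of n-1 to n-4 is 0 or 1, and at least one of n-3 or n-4 must be 1. The following Python sketch, offered only as an illustration, enumerates the combinations that satisfy this rule and lays out the element order for a given choice. The function names `precursor_layout` and `valid_combinations` and the label spellings `NPRO_MHP` and `CPRO_MHP` are hypothetical and are not part of the claims.

```python
from itertools import product


def precursor_layout(n1, n2, n3, n4):
    """Return the 5'-to-3' order of elements for the given n-1..n-4 values."""
    for n in (n1, n2, n3, n4):
        if n not in (0, 1):
            raise ValueError("each of n-1 to n-4 must be 0 or 1")
    if n3 != 1 and n4 != 1:
        raise ValueError("at least one of n-3 or n-4 must equal 1")
    elements = ["SP"]                      # signal peptide is always present
    elements += ["PS"] * n1                # optional processing site
    elements += ["LP", "PS"] * n2          # optional leader peptide + site
    elements += ["NPRO_MHP", "PS"] * n3    # optional N-terminal propeptide + site
    elements += ["MHP"]                    # mature protein is always present
    elements += ["PS", "CPRO_MHP"] * n4    # optional site + C-terminal propeptide
    return elements


def valid_combinations():
    """List every (n-1, n-2, n-3, n-4) tuple allowed by the constraint."""
    return [c for c in product((0, 1), repeat=4) if c[2] == 1 or c[3] == 1]


if __name__ == "__main__":
    for combo in valid_combinations():
        print(combo, "-".join(precursor_layout(*combo)))
```

For example, `precursor_layout(0, 1, 0, 1)` gives SP, LP, PS, MHP, PS, CPRO_MHP, which corresponds to the n-3 = 0, n-4 = 1 arrangement of claim 11; the values chosen here for n-1 and n-2 are assumptions made for the example, since that claim does not fix them.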



Patent Claims

1. A nucleotide sequence which comprises, in the 5' to 3' direction and operably linked, (a) a yeast-recognized
transcription and translation initiation region, (b) a coding sequence for a hybrid precursor polypeptide, and (c) a
yeast-recognized transcription and translation termination region, wherein the hybrid precursor polypeptide
comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein

SP comprises a signal peptide sequence for a protein secreted in yeast;

PS comprises a preferred processing site that is cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a protein secreted in yeast;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence for the mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of the mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3 and n-4 independently = 0 or 1;

wherein the processing sites allow proteolytic processing of the precursor polypeptide in vivo by a yeast host
cell to the mature protein, and wherein at least n-3 or n-4 = 1.

2. The nucleotide sequence according to claim 1, wherein the mammalian protein is a PDGF protein or an IGF protein
or variants thereof.

3. The nucleotide sequence according to claim 1, wherein the protein is a human protein.

4. The nucleotide sequence according to claim 3, wherein the human PDGF is PDGF-BB or variants thereof.

5. The nucleotide sequence according to claim 4, wherein SP is a signal peptide sequence for Saccharomyces
cerevisiae α-factor.

6. The nucleotide sequence according to claim 5, wherein the α-factor is Matα or variants thereof.

7. The nucleotide sequence according to claim 6, wherein n-2 = 1, n-3 = 1 and n-4 = 0.

8. The nucleotide sequence according to claim 7, wherein LP is a truncated leader peptide sequence.

9. The nucleotide sequence according to claim 8, wherein the coding sequence for the hybrid precursor polypeptide
has the nucleotide sequence set forth in SEQ ID NO: 26.

10. The nucleotide sequence according to claim 8, wherein the hybrid precursor polypeptide has the amino acid
sequence set forth in SEQ ID NO: 27.

11. The nucleotide sequence according to claim 3, wherein n-3 = 0 and n-4 = 1 and the human IGF protein is IGF-I-A
or variants thereof.

12. The nucleotide sequence according to claim 11, wherein SP is a signal peptide sequence for a Saccharomyces
cerevisiae α-factor.

13. The nucleotide sequence according to claim 12, wherein the α-factor is Matα or variants thereof.

14. The nucleotide sequence according to claim 13, wherein the coding sequence for the hybrid precursor polypeptide
has the nucleotide sequence set forth in SEQ ID NO: 40.

15. The nucleotide sequence according to claim 13, wherein the hybrid precursor polypeptide has the amino acid
sequence set forth in SEQ ID NO: 41.

16. A vector comprising a nucleotide sequence which comprises, in the 5' to 3' direction and operably linked, (a) a
yeast-recognized transcription and translation initiation region, (b) a coding sequence for a hybrid precursor
polypeptide, and (c) a yeast-recognized transcription and translation termination region, wherein the hybrid
precursor polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein SP comprises a signal peptide sequence for a protein secreted in yeast;

PS comprises a preferred processing site that is cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a protein secreted in yeast;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence for the mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of the mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3 and n-4 independently = 0 or 1;

wherein the processing sites allow proteolytic processing of the precursor polypeptide in vivo by a yeast host
cell to the mature protein, and wherein at least n-3 or n-4 = 1.

17. The vector according to claim 16, wherein the vector is the yeast shuttle vector pAB24.

18. A yeast host cell stably transformed with a nucleotide sequence comprising an expression cassette, wherein the
expression cassette comprises, in the 5' to 3' direction and operably linked, (a) a yeast-recognized transcription
and translation initiation region, (b) a coding sequence for a hybrid precursor polypeptide, and (c) a yeast-recognized
transcription and translation termination region, wherein the hybrid precursor polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein SP comprises a signal peptide sequence for a protein secreted in yeast;

PS comprises a preferred processing site that is cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a protein secreted in yeast;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence for the mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of the mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3 and n-4 independently = 0 or 1;

wherein the processing sites allow proteolytic processing of the precursor polypeptide in vivo by a yeast host
cell to the mature protein, and wherein at least n-3 or n-4 = 1.

19. The cell according to claim 18, wherein the processing sites are dipeptides that are cleaved by the KEX2 gene
product of Saccharomyces.

20. The cell according to claim 19, wherein the dipeptides are 5'-Lys-Arg-3'.

21. The cell according to claim 20, wherein the yeast cell belongs to the genus Saccharomyces.

22. The cell according to claim 21, wherein the yeast cell is S. cerevisiae.

23. A method for the expression of heterologous proteins and for their secretion in biologically active mature form
using a yeast host cell as the expression system, wherein the method comprises: transforming the yeast cell with
a vector comprising a nucleotide sequence which comprises, in the 5' to 3' direction and operably linked, (a) a
yeast-recognized transcription and translation initiation region, (b) a coding sequence for a hybrid precursor
polypeptide, and (c) a yeast-recognized transcription and translation termination region, wherein the hybrid
precursor polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein SP comprises a signal peptide sequence for a protein secreted in yeast;

PS comprises a preferred processing site that is cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence for a protein secreted in yeast;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence for the mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of the mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3 and n-4 independently = 0 or 1;

wherein the processing sites allow proteolytic processing of the precursor polypeptide in vivo by a yeast host
cell to the mature protein, and wherein at least n-3 or n-4 = 1.

Claims

1. A nucleotide sequence comprising, in the 5' to 3' direction and operably linked, (a) a transcription and translation
initiation region recognized in a yeast, (b) a sequence coding for a hybrid precursor polypeptide, and (c) a
transcription and translation termination region recognized in a yeast, wherein said hybrid precursor polypeptide
comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein:

SP comprises a signal peptide sequence of a protein secreted by a yeast;

PS comprises a preferred processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence of a protein secreted by a yeast;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence of said mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow proteolytic processing of said precursor polypeptide to said mature protein
in vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.

2. The nucleotide sequence of claim 1, wherein said mammalian protein is a PDGF protein or an IGF protein, or
variants thereof.

3. The nucleotide sequence of claim 2, wherein said protein is a human protein.

4. The nucleotide sequence of claim 3, wherein said human PDGF is PDGF-BB or variants thereof.

5. The nucleotide sequence of claim 4, wherein SP is a signal peptide sequence of a Saccharomyces cerevisiae
α-factor.

6. The nucleotide sequence of claim 5, wherein said α-factor is Matα or variants thereof.

7. The nucleotide sequence of claim 6, wherein n-2 = 1, n-3 = 1, and n-4 = 0.

8. The nucleotide sequence of claim 7, wherein LP is a truncated leader peptide sequence.

9. The nucleotide sequence of claim 8, wherein said sequence coding for the hybrid precursor polypeptide has the
nucleotide sequence set forth in SEQ ID NO. 26.

10. The nucleotide sequence of claim 8, wherein said hybrid precursor polypeptide has the amino acid sequence
set forth in SEQ ID NO. 27.

11. The nucleotide sequence of claim 3, wherein n-3 = 0 and n-4 = 1 and said human IGF protein is IGF-I-A or
variants thereof.

12. The nucleotide sequence of claim 11, wherein SP is a signal peptide sequence of a Saccharomyces cerevisiae
α-factor.

13. The nucleotide sequence of claim 12, wherein said α-factor is Matα or variants thereof.

14. The nucleotide sequence of claim 13, wherein said sequence coding for said hybrid precursor polypeptide has
the nucleotide sequence set forth in SEQ ID NO. 40.

15. The nucleotide sequence of claim 13, wherein said hybrid precursor polypeptide has the amino acid sequence
set forth in SEQ ID NO. 41.

16. A vector comprising a nucleotide sequence that comprises, in the 5' to 3' direction and operably linked, (a) a
transcription and translation initiation region recognized in a yeast, (b) a sequence coding for a hybrid precursor
polypeptide, and (c) a transcription and translation termination region recognized in a yeast, wherein said hybrid
precursor polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein:

SP comprises a signal peptide sequence of a protein secreted by a yeast;

PS comprises a preferred processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence of a protein secreted by a yeast;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence of said mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow proteolytic processing of said precursor polypeptide to said mature protein
in vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.

17. The vector of claim 16, wherein said vector is the yeast shuttle vector pAB24.

18. A yeast host cell stably transformed with a nucleotide sequence comprising an expression cassette, said cassette
comprising, in the 5' to 3' direction and operably linked, (a) a transcription and translation initiation region
recognized in a yeast, (b) a sequence coding for a hybrid precursor polypeptide, and (c) a transcription and
translation termination region recognized in a yeast, wherein said hybrid precursor polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein:

SP comprises a signal peptide sequence of a protein secreted by a yeast;

PS comprises a preferred processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence of a protein secreted by a yeast;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence of said mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow proteolytic processing of said precursor polypeptide to said mature protein
in vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.

19. The cell of claim 18, wherein said processing sites are dipeptides cleaved by the product of the KEX2 gene of
Saccharomyces.

20. The cell of claim 19, wherein said dipeptides are 5'-Lys-Arg-3'.

21. The cell of claim 20, wherein said yeast cell is of the genus Saccharomyces.

22. The cell of claim 21, wherein said yeast cell is S. cerevisiae.

23. A method for expressing heterologous proteins and secreting them in the biologically active mature form using
a yeast host cell as the expression system, said method comprising transforming said yeast cell with a vector
comprising a nucleotide sequence that comprises, in the 5' to 3' direction and operably linked, (a) a transcription
and translation initiation region recognized in a yeast, (b) a sequence coding for a hybrid precursor polypeptide,
and (c) a transcription and translation termination region recognized in a yeast, wherein said hybrid precursor
polypeptide comprises:

5'-SP-(PS)_{n-1}-(LP-PS)_{n-2}-(NPRO_{MHP}-PS)_{n-3}-MHP-(PS-CPRO_{MHP})_{n-4}-3'

wherein:

SP comprises a signal peptide sequence of a protein secreted by a yeast;

PS comprises a preferred processing site cleaved in vivo by a yeast proteolytic enzyme;

LP comprises a leader peptide sequence of a protein secreted by a yeast;

NPRO_{MHP} comprises a native N-terminal propeptide sequence of a mature heterologous mammalian protein
of interest;

MHP comprises a peptide sequence of said mature heterologous mammalian protein of interest;

CPRO_{MHP} comprises a native C-terminal propeptide sequence of said mature heterologous mammalian protein
of interest; and

n-1, n-2, n-3, and n-4 independently = 0 or 1;

wherein said processing sites allow proteolytic processing of said precursor polypeptide to said mature protein
in vivo by a yeast host cell, and wherein at least n-3 or n-4 = 1.